add training recipe
README.md
CHANGED
|
@@ -3,9 +3,12 @@ license: apache-2.0
|
|
| 3 |
---
|
| 4 |
# Model Card for Zamba2-2.7B-Instruct
|
| 5 |
|
| 6 |
-
Zamba2-2.7B-Instruct is obtained from [Zamba2-2.7B](https://huggingface.co/Zyphra/Zamba2-2.7B) by fine-tuning on instruction-following and chat datasets.
|
| 7 |
|
| 8 |
-
| 9 |
|
| 10 |
## Quick start
|
| 11 |
| 3 |
---
|
| 4 |
# Model Card for Zamba2-2.7B-Instruct
|
| 5 |
|
| 6 |
+
Zamba2-2.7B-Instruct is obtained from [Zamba2-2.7B](https://huggingface.co/Zyphra/Zamba2-2.7B) by fine-tuning on instruction-following and chat datasets. Specifically:
|
| 7 |
|
| 8 |
+
1. SFT of the base [Zamba2-2.7B](https://huggingface.co/Zyphra/Zamba2-2.7B) model on [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) and [Infinity-Instruct](https://huggingface.co/datasets/BAAI/Infinity-Instruct)
|
| 9 |
+
2. DPO of the SFT checkpoint on [ultrafeedback_binarized](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized), [orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs), and [OpenHermesPreferences](https://huggingface.co/datasets/argilla/OpenHermesPreferences)
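
For readers unfamiliar with the second stage, the standard DPO objective for a single preference pair can be sketched as follows. This is a minimal illustration, not the training code used for this model: the card does not state hyperparameters, so `beta=0.1` is an assumed placeholder, and the function name `dpo_loss` is hypothetical. The inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under a frozen reference model (here assumed to be the SFT checkpoint from step 1).

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (beta is an assumed value)."""
    # Implicit rewards: how much more likely each response is under the
    # policy than under the frozen reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), computed as softplus(-margin) in a
    # numerically stable form for both signs of the margin.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss falls as the policy raises the chosen response's likelihood relative to the reference faster than the rejected one's; when the policy equals the reference, the margin is zero and the loss is `log(2)`.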
|
| 10 |
+
|
| 11 |
+
Zamba2-2.7B-Instruct is a hybrid model composed of state-space ([Mamba2](https://github.com/state-spaces/mamba)) and transformer blocks.
|
| 12 |
|
| 13 |
## Quick start
|
| 14 |
|