Update README.md
```diff
--- a/README.md
+++ b/README.md
@@ -62,6 +62,8 @@ This page introduces the **Babel-83B-Chat** model
 
 We primarily leverage open-source multilingual SFT training corpora and translated SFT training data. Specifically, we utilize WildChat, a dataset comprising 1 million user-ChatGPT conversations with over 2.5 million interaction turns. Additionally, we employ Everything Instruct Multilingual, an extensive Alpaca-instruct-formatted dataset covering a diverse range of topics.
 
+**Note that results are achieved purely by leveraging publicly available datasets, showcasing the robust foundational performance of Babel base models. We believe that incorporating more SFT data across diverse types, domains, and formats, along with additional alignment data and preference tuning, will further enhance the chat version beyond its current capabilities.**
+
 
 ## Evaluation
 
@@ -131,8 +133,6 @@ print(f"Response:\n {response[0]}")
 | *Average* | 75.1 | 71.9 | 67.0 | **74.4** |
 
 
-**Note that results are achieved purely by leveraging publicly available datasets, showcasing the robust foundational performance of Babel base models. We believe that incorporating more SFT data across diverse types, domains, and formats, along with additional alignment data and preference tuning, will further enhance the chat version beyond its current capabilities.**
-
 ## Acknowledgement
 
 We would like to thank Guanzheng Chen for assisting with the implementation of the training codebase. Our special thanks go to our professional and native linguists—Tantong Champaiboon, Nguyen Ngoc Yen Nhi, and Tara Devina Putri—who contributed to building, evaluating, and fact-checking our sampled pretraining dataset. We also appreciate Fan Wang, Jiasheng Tang, Xin Li, and Hao Zhang for their efforts in coordinating computing resources.
 
```
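For context, the two SFT corpora named in the changed paragraph are distributed on the Hugging Face Hub, so they can be pulled with the `datasets` library. The sketch below is not part of the README; the repository IDs `allenai/WildChat-1M` and `rombodawg/Everything_Instruct_Multilingual`, and the way the records are inspected, are assumptions about how these corpora are hosted and structured.

```python
# Minimal sketch (not from the README) of loading the two SFT corpora with the
# Hugging Face `datasets` library. Repository IDs and field names are assumptions;
# WildChat may additionally require accepting the dataset terms on the Hub.
from datasets import load_dataset

# WildChat: ~1M user-ChatGPT conversations with multi-turn interactions.
wildchat = load_dataset("allenai/WildChat-1M", split="train")

# Everything Instruct Multilingual: Alpaca-style instruction/input/output records.
everything = load_dataset("rombodawg/Everything_Instruct_Multilingual", split="train")

# Peek at the schema of one example from each corpus before mapping the records
# into whatever chat template the SFT pipeline expects.
print(wildchat[0].keys())
print(everything[0].keys())
```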