Yiran0924 committed · verified
Commit 554afc9 · Parent(s): 3123e01

Update README.md

Files changed (1):
  1. README.md (+2 -1)
README.md CHANGED
@@ -62,6 +62,8 @@ This page introduces the **Babel-9B-Chat** model

We primarily leverage open-source multilingual SFT training corpora and translated SFT training data. Specifically, we utilize WildChat, a dataset comprising 1 million user-ChatGPT conversations with over 2.5 million interaction turns. Additionally, we employ Everything Instruct Multilingual, an extensive Alpaca-instruct-formatted dataset covering a diverse range of topics.

+ **Note that results are achieved purely by leveraging publicly available datasets, showcasing the robust foundational performance of Babel base models. We believe that incorporating more SFT data across diverse types, domains, and formats, along with additional alignment data and preference tuning, will further enhance the chat version beyond its current capabilities.**
+

## Evaluation

@@ -130,7 +132,6 @@ print(f"Response:\n {response[0]}")

| Flores-200 | 50.8 | 54.8 | 48.9 | 47.3 | 45.8 | **56.7** |
| *Average* | 60.7 | 65.7 | 56.0 | 54.5 | 61.3 | **67.5** |

- **Note that results are achieved purely by leveraging publicly available datasets, showcasing the robust foundational performance of Babel base models. We believe that incorporating more SFT data across diverse types, domains, and formats, along with additional alignment data and preference tuning, will further enhance the chat version beyond its current capabilities.**

## Acknowledgement
We would like to thank Guanzheng Chen for assisting with the implementation of the training codebase. Our special thanks go to our professional and native linguists—Tantong Champaiboon, Nguyen Ngoc Yen Nhi, and Tara Devina Putri—who contributed to building, evaluating, and fact-checking our sampled pretraining dataset. We also appreciate Fan Wang, Jiasheng Tang, Xin Li, and Hao Zhang for their efforts in coordinating computing resources.