Yiran0924 committed · verified
Commit 554afc9 · Parent(s): 3123e01

Update README.md

Files changed (1):
  1. README.md (+2 -1)
README.md CHANGED
@@ -62,6 +62,8 @@ This page introduces the **Babel-9B-Chat** model

We primarily leverage open-source multilingual SFT training corpora and translated SFT training data. Specifically, we utilize WildChat, a dataset comprising 1 million user-ChatGPT conversations with over 2.5 million interaction turns. Additionally, we employ Everything Instruct Multilingual, an extensive Alpaca-instruct-formatted dataset covering a diverse range of topics.

+ **Note that results are achieved purely by leveraging publicly available datasets, showcasing the robust foundational performance of Babel base models. We believe that incorporating more SFT data across diverse types, domains, and formats, along with additional alignment data and preference tuning, will further enhance the chat version beyond its current capabilities.**
+

## Evaluation

@@ -130,7 +132,6 @@ print(f"Response:\n {response[0]}")

| Flores-200 | 50.8 | 54.8 | 48.9 | 47.3 | 45.8 | **56.7** |
| *Average* | 60.7 | 65.7 | 56.0 | 54.5 | 61.3 | **67.5** |

- **Note that results are achieved purely by leveraging publicly available datasets, showcasing the robust foundational performance of Babel base models. We believe that incorporating more SFT data across diverse types, domains, and formats, along with additional alignment data and preference tuning, will further enhance the chat version beyond its current capabilities.**

## Acknowledgement
We would like to thank Guanzheng Chen for assisting with the implementation of the training codebase. Our special thanks go to our professional and native linguists—Tantong Champaiboon, Nguyen Ngoc Yen Nhi, and Tara Devina Putri—who contributed to building, evaluating, and fact-checking our sampled pretraining dataset. We also appreciate Fan Wang, Jiasheng Tang, Xin Li, and Hao Zhang for their efforts in coordinating computing resources.