YuLu0713 committed on
Commit
2af2ccc
·
verified ·
1 Parent(s): 3936c78

Update README.md

Files changed (1)
  1. README.md +68 -5
README.md CHANGED
@@ -1,5 +1,68 @@
1
- ---
2
- license: other
3
- license_name: openmdw
4
- license_link: LICENSE
5
- ---
1
+ ---
2
+ license: other
3
+ license_name: openmdw
4
+ license_link: LICENSE
5
+ ---
6
+ # Seed-X-RM-7B
7
+ <a href="https://github.com/ByteDance-Seed/Seed-X-7B/blob/main/Technical_Report.pdf">
8
+ <img src="https://img.shields.io/badge/Seed--X-Report-blue"></a>
9
+ <a href="https://huggingface.co/ByteDance-Seed/Seed-X-RM-7B">
10
+ <img src="https://img.shields.io/badge/Seed--X-Hugging Face-brightgreen"></a>
11
+ <a href="https://github.com/ByteDance-Seed/Seed-X-7B/blob/main/LICENSE.openmdw">
12
+ <img src="https://img.shields.io/badge/License-OpenMDW-yellow"></a>
13
+
14
+ ## Introduction
15
+ We are excited to introduce **Seed-X**, a powerful open-source series of multilingual translation language models with 7B parameters, including instruction and reasoning models, that pushes the boundaries of translation capabilities.
16
+ We develop Seed-X as an accessible, off-the-shelf tool to support the community in advancing translation research and applications:
17
+ * **Exceptional translation capabilities**: Seed-X exhibits state-of-the-art translation capabilities, on par with or outperforming ultra-large models like Gemini-2.5, Claude-3.5, and GPT-4, as validated by human evaluations and automatic metrics.
18
+ * **Deployment and inference-friendly**: With a compact 7B parameter count and the Mistral architecture, Seed-X offers outstanding translation performance in a lightweight and efficient package, ideal for deployment and inference.
19
+ * **Broad domain coverage**: Seed-X excels on a highly challenging translation test set spanning diverse domains, including the internet, science and technology, office dialogues, e-commerce, biomedicine, finance, law, literature, and entertainment.
20
+ ![performance](imgs/model_comparsion.png)
21
+
22
+ This repo contains the **Seed-X-RM** model, with the following features:
23
+ * Type: Causal language models
24
+ * Training Stage: Pretraining & Post-training
25
+ * Data Source: Human preference data on multilingual translation
26
+ * Support: Evaluating translations between 28 languages
27
+
28
+ | Languages | Abbr. | Languages | Abbr. | Languages | Abbr. | Languages | Abbr. |
29
+ | ----------- | ----------- |-----------|-----------|-----------|-----------| -----------|-----------|
30
+ |Arabic | ar |French | fr | Malay | ms | Russian | ru |
31
+ |Czech | cs |Croatian | hr | Norwegian Bokmål | nb | Swedish | sv |
32
+ |Danish | da |Hungarian | hu | Dutch | nl | Thai | th |
33
+ |German | de |Indonesian | id | Norwegian | no | Turkish | tr |
34
+ |English | en |Italian | it | Polish | pl | Ukrainian | uk |
35
+ |Spanish | es |Japanese | ja | Portuguese | pt | Vietnamese | vi |
36
+ |Finnish | fi |Korean | ko | Romanian | ro | Chinese | zh |
37
+
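The table above can also be kept as a small lookup when preparing evaluation data. The following is a minimal sketch: the names `SUPPORTED_LANGUAGES` and `is_supported_pair` are hypothetical helpers for illustration and are not part of the Seed-X release.

```python
# Language names and abbreviations taken from the table above.
SUPPORTED_LANGUAGES = {
    "Arabic": "ar", "French": "fr", "Malay": "ms", "Russian": "ru",
    "Czech": "cs", "Croatian": "hr", "Norwegian Bokmål": "nb", "Swedish": "sv",
    "Danish": "da", "Hungarian": "hu", "Dutch": "nl", "Thai": "th",
    "German": "de", "Indonesian": "id", "Norwegian": "no", "Turkish": "tr",
    "English": "en", "Italian": "it", "Polish": "pl", "Ukrainian": "uk",
    "Spanish": "es", "Japanese": "ja", "Portuguese": "pt", "Vietnamese": "vi",
    "Finnish": "fi", "Korean": "ko", "Romanian": "ro", "Chinese": "zh",
}


def is_supported_pair(src_code: str, tgt_code: str) -> bool:
    """Return True if both codes are in the table and they differ."""
    codes = set(SUPPORTED_LANGUAGES.values())
    return src_code in codes and tgt_code in codes and src_code != tgt_code
```

For example, `is_supported_pair("en", "zh")` is true, while a code that is not in the table is rejected.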
38
+ ## Model Downloads
39
+ | Model Name | Description | Download |
40
+ | ----------- | ----------- |-----------|
41
+ | Seed-X-Instruct | Instruction-tuned for alignment with user intent. |🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-Instruct-7B)|
42
+ | Seed-X-PPO | RL trained to boost translation capabilities. | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-PPO-7B)|
43
+ | 👉 **Seed-X-RM** | Reward model to evaluate the quality of translation.| 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-RM-7B)|
44
+
45
+ ## Quickstart
46
+ Seed-X-RM assigns a reward score to a given translation and uses the same prompt format as Seed-X-PPO.
47
+
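This README does not show the prompt format or the scoring call, so the following is only a sketch of the workflow described above. The prompt template (an instruction line, the source text, then a target-language tag such as `<zh>`) is an assumption, as is appending the candidate translation directly after the prompt; check the Seed-X-PPO model card for the exact format. `build_prompt`, `score_translation`, and `dummy_reward` are hypothetical names, and `dummy_reward` is a placeholder standing in for a real call to Seed-X-RM.

```python
from typing import Callable


def build_prompt(source_text: str, src_lang: str, tgt_lang: str, tgt_code: str) -> str:
    """Build a translation prompt (assumed template, not confirmed by this README)."""
    return (
        f"Translate the following {src_lang} sentence into {tgt_lang}:\n"
        f"{source_text} <{tgt_code}>"
    )


def score_translation(prompt: str, translation: str,
                      reward_fn: Callable[[str], float]) -> float:
    """Score a candidate translation with a caller-supplied reward function.

    Assumption: the candidate is appended to the prompt before scoring.
    """
    return reward_fn(prompt + translation)


def dummy_reward(text: str) -> float:
    """Placeholder reward for illustration only; replace with a Seed-X-RM call."""
    return float(len(text))


prompt = build_prompt("May the force be with you", "English", "Chinese", "zh")
score = score_translation(prompt, "愿原力与你同在", dummy_reward)
```

Passing the reward function as an argument keeps the prompt construction testable without loading the 7B model.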
48
+ ## Evaluation
49
+ We evaluated Seed-X on a diverse set of translation benchmarks, including FLORES-200, WMT-25, and a publicly released [challenge set](https://github.com/ByteDance-Seed/Seed-X-7B/tree/main/challenge_set) accompanied by human evaluations.
50
+ ![human_eval](imgs/humen_eval.png)
51
+ For detailed benchmark results and analysis, please refer to our [Technical Report](https://github.com/ByteDance-Seed/Seed-X-7B/blob/main/Technical_Report.pdf).
52
+
53
+ ## License
54
+ This project is licensed under OpenMDW. See the [LICENSE](https://github.com/ByteDance-Seed/Seed-X-7B/blob/main/LICENSE.openmdw) file for details.
55
+
56
+ ## Citation
57
+ If you find Seed-X useful for your research and applications, feel free to give us a star ⭐ or cite us using:
58
+ ```bibtex
59
+ @Article{XXX,
60
+ title={XXXXXXXXXXX},
61
+ author={XXX,XXX,XXX,XXX},
62
+ year={2025},
63
+ eprint={XXXX.XXXXX},
64
+ archivePrefix={arXiv},
65
+ primaryClass={cs.XX}
66
+ }
67
+ ```
68
+ We will soon publish our technical report on arXiv.