Update README.md
README.md
CHANGED
@@ -1,5 +1,68 @@
|
|
1 |
-
---
|
2 |
-
license: other
|
3 |
-
license_name: openmdw
|
4 |
-
license_link: LICENSE
|
5 |
-
---
|
1 |
+
---
|
2 |
+
license: other
|
3 |
+
license_name: openmdw
|
4 |
+
license_link: LICENSE
|
5 |
+
---
|
6 |
+
# Seed-X-RM-7B
|
7 |
+
<a href="https://github.com/ByteDance-Seed/Seed-X-7B/blob/main/Technical_Report.pdf">
|
8 |
+
<img src="https://img.shields.io/badge/Seed--X-Report-blue"></a>
|
9 |
+
<a href="https://huggingface.co/ByteDance-Seed/Seed-X-RM-7B">
|
10 |
+
<img src="https://img.shields.io/badge/Seed--X-Hugging Face-brightgreen"></a>
|
11 |
+
<a href="https://github.com/ByteDance-Seed/Seed-X-7B/blob/main/LICENSE.openmdw">
|
12 |
+
<img src="https://img.shields.io/badge/License-OpenMDW-yellow"></a>
|
13 |
+
|
14 |
+
## Introduction
|
15 |
+
We are excited to introduce **Seed-X**, an open-source series of 7B-parameter multilingual translation language models, including instruction and reasoning models, that pushes the boundaries of translation capabilities.
|
16 |
+
We develop Seed-X as an accessible, off-the-shelf tool to support the community in advancing translation research and applications:
|
17 |
+
* **Exceptional translation capabilities**: Seed-X exhibits state-of-the-art translation capabilities, on par with or outperforming ultra-large models like Gemini-2.5, Claude-3.5, and GPT-4, as validated by human evaluations and automatic metrics.
|
18 |
+
* **Deployment and inference-friendly**: With a compact 7B parameter count and the Mistral architecture, Seed-X offers outstanding translation performance in a lightweight and efficient package, ideal for deployment and inference.
|
19 |
+
* **Broad domain coverage**: Seed-X excels on a highly challenging translation test set spanning diverse domains, including the internet, science and technology, office dialogues, e-commerce, biomedicine, finance, law, literature, and entertainment.
|
20 |
+

|
21 |
+
|
22 |
+
This repo contains the **Seed-X-RM** model, with the following features:
|
23 |
+
* Type: Causal language models
|
24 |
+
* Training Stage: Pretraining & Post-training
|
25 |
+
* Data Source: Human preference data on multilingual translation
|
26 |
+
* Support: Evaluating translation between 28 languages
|
27 |
+
|
28 |
+
| Languages | Abbr. | Languages | Abbr. | Languages | Abbr. | Languages | Abbr. |
|
29 |
+
| ----------- | ----------- |-----------|-----------|-----------|-----------| -----------|-----------|
|
30 |
+
|Arabic | ar |French | fr | Malay | ms | Russian | ru |
|
31 |
+
|Czech | cs |Croatian | hr | Norwegian Bokmål | nb | Swedish | sv |
|
32 |
+
|Danish | da |Hungarian | hu | Dutch | nl | Thai | th |
|
33 |
+
|German | de |Indonesian | id | Norwegian | no | Turkish | tr |
|
34 |
+
|English | en |Italian | it | Polish | pl | Ukrainian | uk |
|
35 |
+
|Spanish | es |Japanese | ja | Portuguese | pt | Vietnamese | vi |
|
36 |
+
|Finnish | fi |Korean | ko | Romanian | ro | Chinese | zh |
|
37 |
+
|
38 |
+
## Model Downloads
|
39 |
+
| Model Name | Description | Download |
|
40 |
+
| ----------- | ----------- |-----------|
|
41 |
+
| Seed-X-Instruct | Instruction-tuned for alignment with user intent. |🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-Instruct-7B)|
|
42 |
+
| Seed-X-PPO | RL trained to boost translation capabilities. | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-PPO-7B)|
|
43 |
+
| 👉 **Seed-X-RM** | Reward model to evaluate the quality of translation.| 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-RM-7B)|
|
44 |
+
|
45 |
+
## Quickstart
|
46 |
+
Seed-X-RM assigns a reward score to the given translation with the same prompt format as Seed-X-PPO.
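
As a rough illustration of the prompt side of this, the sketch below builds a translation prompt from the language codes in the table above. The template string (`Translate the following ... sentence into ...:` followed by a `<code>` target tag) and the helper name `build_prompt` are assumptions made for illustration; this README does not spell out the Seed-X-PPO format, how the candidate translation is attached, or how the reward score is read out, so the sketch stops at the prompt.

```python
# Language codes copied from the supported-languages table in this README.
SUPPORTED = {
    "ar": "Arabic", "fr": "French", "ms": "Malay", "ru": "Russian",
    "cs": "Czech", "hr": "Croatian", "nb": "Norwegian Bokmal", "sv": "Swedish",
    "da": "Danish", "hu": "Hungarian", "nl": "Dutch", "th": "Thai",
    "de": "German", "id": "Indonesian", "no": "Norwegian", "tr": "Turkish",
    "en": "English", "it": "Italian", "pl": "Polish", "uk": "Ukrainian",
    "es": "Spanish", "ja": "Japanese", "pt": "Portuguese", "vi": "Vietnamese",
    "fi": "Finnish", "ko": "Korean", "ro": "Romanian", "zh": "Chinese",
}


def build_prompt(text: str, src: str, tgt: str) -> str:
    """Build a translation prompt for a source/target language pair.

    The template below is an ASSUMPTION, not something this README confirms;
    check the Seed-X-PPO model card for the authoritative format.
    """
    for code in (src, tgt):
        if code not in SUPPORTED:
            raise ValueError(f"unsupported language code: {code!r}")
    return (
        f"Translate the following {SUPPORTED[src]} sentence "
        f"into {SUPPORTED[tgt]}:\n{text} <{tgt}>"
    )


if __name__ == "__main__":
    print(build_prompt("May the force be with you", "en", "zh"))
```

Validating the codes up front keeps unsupported pairs from reaching the model, where the failure would be harder to notice.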
|
47 |
+
|
48 |
+
## Evaluation
|
49 |
+
We evaluated Seed-X on a diverse set of translation benchmarks, including FLORES-200, WMT-25, and a publicly released [challenge set](https://github.com/ByteDance-Seed/Seed-X-7B/tree/main/challenge_set) accompanied by human evaluations.
|
50 |
+

|
51 |
+
For detailed benchmark results and analysis, please refer to our [Technical Report](https://github.com/ByteDance-Seed/Seed-X-7B/blob/main/Technical_Report.pdf).
|
52 |
+
|
53 |
+
## License
|
54 |
+
This project is licensed under OpenMDW. See the [LICENSE](https://github.com/ByteDance-Seed/Seed-X-7B/blob/main/LICENSE.openmdw) file for details.
|
55 |
+
|
56 |
+
## Citation
|
57 |
+
If you find Seed-X useful for your research and applications, feel free to give us a star ⭐ or cite us using:
|
58 |
+
```bibtex
|
59 |
+
@Article{XXX,
|
60 |
+
title={XXXXXXXXXXX},
|
61 |
+
author={XXX,XXX,XXX,XXX},
|
62 |
+
year={2025},
|
63 |
+
eprint={XXXX.XXXXX},
|
64 |
+
archivePrefix={arXiv},
|
65 |
+
primaryClass={cs.XX}
|
66 |
+
}
|
67 |
+
```
|
68 |
+
We will soon publish our technical report on arXiv.
|