---
license: cc-by-nc-sa-4.0
language:
- zh
- en
pipeline_tag: translation
---

# Opus Tatoeba | Chinese -> English

* dataset: opus
* model: transformer
* source language(s): cjy cmn gan hak hsn lzh nan wuu yue
* target language(s): eng
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* download: [opus-2021-02-18.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/zho-eng/opus-2021-02-18.zip)
* test set translations: [opus-2021-02-18.test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/zho-eng/opus-2021-02-18.test.txt)
* test set scores: [opus-2021-02-18.eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/zho-eng/opus-2021-02-18.eval.txt)

## Benchmarks

| testset | BLEU | chr-F | #sent | #words | BP |
|---------|-------|-------|-------|--------|----|
| Tatoeba-test.cjy_Hans-eng | 20.9 | 0.394 | 2 | 18 | 0.751 |
| Tatoeba-test.cjy_Hant-eng | 16.0 | 0.173 | 1 | 4 | 1.000 |
| Tatoeba-test.cmn-eng | 17.0 | 0.480 | 15 | 82 | 1.000 |
| Tatoeba-test.cmn_Hans-eng | 36.7 | 0.558 | 4195 | 36106 | 0.935 |
| Tatoeba-test.cmn_Hant-eng | 40.4 | 0.585 | 4418 | 34780 | 0.955 |
| Tatoeba-test.gan-eng | 43.2 | 0.565 | 1 | 9 | 1.000 |
| Tatoeba-test.lzh-eng | 3.6 | 0.191 | 98 | 933 | 0.755 |
| Tatoeba-test.lzh_Hans-eng | 6.0 | 0.170 | 3 | 42 | 0.900 |
| Tatoeba-test.nan-eng | 6.1 | 0.179 | 2 | 10 | 1.000 |
| Tatoeba-test.wuu-eng | 17.4 | 0.400 | 203 | 1625 | 1.000 |
| Tatoeba-test.yue_Hans-eng | 21.1 | 0.407 | 630 | 5838 | 0.946 |
| Tatoeba-test.yue_Hant-eng | 25.2 | 0.428 | 431 | 3350 | 0.949 |
| Tatoeba-test.zho-eng | 36.0 | 0.546 | 9999 | 82822 | 0.946 |
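
The BP column is presumably the BLEU brevity penalty, which scales the score down when the system output is shorter than the reference. As a minimal sketch of how such a value is derived, the snippet below uses the standard BLEU definition (1 when the hypothesis is at least as long as the reference, otherwise `exp(1 - ref_len / hyp_len)`). The function name `brevity_penalty` and the example lengths are illustrative assumptions, not taken from the evaluation scripts behind this table.

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty for corpus-level token counts.

    hyp_len: total number of tokens in the system output (hypothetical input)
    ref_len: total number of tokens in the reference translations
    """
    if hyp_len <= 0:
        # An empty hypothesis gets no credit.
        return 0.0
    if hyp_len >= ref_len:
        # Outputs at least as long as the reference are not penalized.
        return 1.0
    # Shorter outputs are penalized exponentially in the length ratio.
    return math.exp(1.0 - ref_len / hyp_len)


if __name__ == "__main__":
    # Assumed lengths for illustration only: a 14-token output
    # scored against an 18-token reference.
    print(round(brevity_penalty(14, 18), 3))
```

Under the assumption that `#words` counts reference tokens, a penalty near 0.75 would correspond to output roughly 78% as long as the reference, while rows with BP 1.000 indicate output at least as long as the reference.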