Update README.md
README.md
CHANGED
@@ -1,3 +1,34 @@
----
-license: cc-by-nc-sa-4.0
-
+---
+license: cc-by-nc-sa-4.0
+language:
+- fi
+- en
+pipeline_tag: translation
+---
+
+# Opus Tatoeba | Finnish -> English
+
+* dataset: opus
+* model: transformer-align
+* source language(s): fin
+* target language(s): eng
+* model: transformer-align
+* pre-processing: normalization + SentencePiece (spm32k,spm32k)
+* download: [opus-2021-02-18.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/fin-eng/opus-2021-02-18.zip)
+* test set translations: [opus-2021-02-18.test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/fin-eng/opus-2021-02-18.test.txt)
+* test set scores: [opus-2021-02-18.eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/fin-eng/opus-2021-02-18.eval.txt)
+
+## Benchmarks
+
+| testset | BLEU | chr-F | #sent | #words | BP |
+|---------|-------|-------|-------|--------|----|
+| newsdev2015-enfi.fin-eng | 25.3 | 0.536 | 1500 | 32104 | 1.000 |
+| newstest2015-enfi.fin-eng | 26.9 | 0.547 | 1370 | 27356 | 0.997 |
+| newstest2016-enfi.fin-eng | 29.0 | 0.571 | 3000 | 63043 | 1.000 |
+| newstest2017-enfi.fin-eng | 32.3 | 0.594 | 3002 | 61936 | 0.997 |
+| newstest2018-enfi.fin-eng | 23.8 | 0.517 | 3000 | 62325 | 0.991 |
+| newstest2019-fien.fin-eng | 29.0 | 0.565 | 1996 | 36227 | 1.000 |
+| newstestB2016-enfi.fin-eng | 24.5 | 0.527 | 3000 | 63043 | 0.999 |
+| newstestB2017-enfi.fin-eng | 27.4 | 0.557 | 3002 | 61936 | 1.000 |
+| newstestB2017-fien.fin-eng | 27.4 | 0.557 | 3002 | 61936 | 1.000 |
+| Tatoeba-test.fin-eng | 53.4 | 0.697 | 10000 | 74651 | 0.990 |
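
The BP column in the benchmark table is the BLEU brevity penalty. As a minimal sketch of the standard definition, the penalty can be computed as below. The function name `brevity_penalty` and the example lengths are illustrative assumptions. They are not taken from the evaluation scripts that produced these scores.

```python
import math


def brevity_penalty(candidate_len: int, reference_len: int) -> float:
    """Standard BLEU brevity penalty (hypothetical helper, for illustration).

    Returns 1.0 when the system output is at least as long as the reference,
    and exp(1 - r/c) when it is shorter, where c is the output length and
    r is the reference length (both in tokens).
    """
    if candidate_len <= 0:
        # An empty output gets the maximum penalty.
        return 0.0
    if candidate_len >= reference_len:
        return 1.0
    return math.exp(1.0 - reference_len / candidate_len)


if __name__ == "__main__":
    # Illustrative lengths only: an output about 1% shorter than the reference.
    print(round(brevity_penalty(9900, 10000), 3))
```

Under this definition, a BP of 0.990 (as in the Tatoeba-test row) corresponds to system output roughly 1% shorter than the reference, while 1.000 means the output was not shorter than the reference.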