kalese committed · verified
Commit c2b0360 · 1 parent: 5e4f8cd

End of training

README.md ADDED
@@ -0,0 +1,85 @@
+ ---
+ license: apache-2.0
+ base_model: Helsinki-NLP/opus-mt-en-ro
+ tags:
+ - generated_from_trainer
+ datasets:
+ - arrow
+ metrics:
+ - bleu
+ model-index:
+ - name: opus-mt-en-bkm-Final-60
+   results:
+   - task:
+       name: Sequence-to-sequence Language Modeling
+       type: text2text-generation
+     dataset:
+       name: arrow
+       type: arrow
+       config: default
+       split: train
+       args: default
+     metrics:
+     - name: Bleu
+       type: bleu
+       value: 9.354
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # opus-mt-en-bkm-Final-60
+
+ This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-ro](https://huggingface.co/Helsinki-NLP/opus-mt-en-ro) on the arrow dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.5584
+ - Bleu: 9.354
+ - Gen Len: 40.6029
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 16
+ - eval_batch_size: 16
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 10
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
+ |:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
+ | 3.4876        | 1.0   | 601  | 2.2031          | 2.3601 | 50.4173 |
+ | 2.2475        | 2.0   | 1202 | 1.9329          | 4.8587 | 40.8697 |
+ | 2.0162        | 3.0   | 1803 | 1.7959          | 5.9413 | 39.0495 |
+ | 1.8926        | 4.0   | 2404 | 1.7144          | 6.9033 | 40.8797 |
+ | 1.71          | 5.0   | 3005 | 1.6537          | 7.7651 | 40.4224 |
+ | 1.6501        | 6.0   | 3606 | 1.6161          | 8.441  | 41.3464 |
+ | 1.6053        | 7.0   | 4207 | 1.5869          | 8.812  | 40.534  |
+ | 1.554         | 8.0   | 4808 | 1.5725          | 9.2092 | 40.4813 |
+ | 1.5409        | 9.0   | 5409 | 1.5608          | 9.3966 | 40.9083 |
+ | 1.5049        | 10.0  | 6010 | 1.5584          | 9.354  | 40.6029 |
+
+
+ ### Framework versions
+
+ - Transformers 4.39.3
+ - Pytorch 2.2.1+cu121
+ - Datasets 2.18.0
+ - Tokenizers 0.15.2
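The hyperparameters above imply a linear learning-rate decay from 2e-05 to zero over the run's 6010 optimizer steps (601 steps per epoch for 10 epochs, matching the results table). A minimal sketch of that schedule, assuming zero warmup since the card lists none; the function name is ours, not from the Trainer:

```python
# Linear LR decay, as selected by lr_scheduler_type "linear".
# Assumption: no warmup steps, since none are listed in the card.
def linear_lr(step, total_steps=6010, base_lr=2e-05):
    """Learning rate at a given optimizer step under linear decay to zero."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining

print(linear_lr(0))      # start of training: full base rate
print(linear_lr(3005))   # midpoint (end of epoch 5): half the base rate
print(linear_lr(6010))   # final step: zero
```

Under this schedule the rate halves exactly at the midpoint of training, which is consistent with the steadily shrinking loss improvements in the later epochs of the table.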
generation_config.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "bad_words_ids": [
+     [
+       59542
+     ]
+   ],
+   "bos_token_id": 0,
+   "decoder_start_token_id": 59542,
+   "eos_token_id": 0,
+   "forced_eos_token_id": 0,
+   "max_length": 512,
+   "num_beams": 4,
+   "pad_token_id": 59542,
+   "renormalize_logits": true,
+   "transformers_version": "4.39.3"
+ }
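In this config, token 59542 serves as both the pad/decoder-start token and the sole `bad_words_ids` entry, so generation can never emit it, and `renormalize_logits` rescales the surviving probabilities to sum to 1 after the ban. A toy sketch of that mask-and-renormalize step over a tiny hypothetical vocabulary; the function and the logits are illustrative, not the transformers internals:

```python
import math

def masked_probs(logits, bad_ids):
    """Ban the given token ids, then renormalize the rest to a distribution."""
    masked = [float("-inf") if i in bad_ids else x for i, x in enumerate(logits)]
    exps = [0.0 if x == float("-inf") else math.exp(x) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 4-token vocab; id 3 plays the role of the banned pad token 59542.
probs = masked_probs([0.1, 1.2, 0.7, 2.0], bad_ids={3})
assert probs[3] == 0.0                  # the banned token can never be sampled
assert abs(sum(probs) - 1.0) < 1e-12    # renormalized to a proper distribution
```

Without renormalization, beam search (here `num_beams: 4`) would compare beams whose probability mass no longer sums to 1, which is why the flag is set alongside the ban.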
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:aa25678d5a721a2896979d38be01f96a371d253a8b5fc7d490ff0ee913ee4d2f
+ oid sha256:b4647621fcb31df4930153de676964a8e87c135ae377ee8a0d2d79816a512542
  size 298765276
runs/Apr02_12-44-41_46316c2715fa/events.out.tfevents.1712061953.46316c2715fa.595.2 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:beeb37b2d4a572b42c795289e6aafef05e816f9e2f63800834747d9cbd4ba9c4
+ oid sha256:80a51be11e1c4199f916fc2ec7d435790aeade1552ba484440b73251e605cadc
- size 11358
+ size 12082