---
license: cc-by-nc-4.0
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: NNLB-alt-en-bleu-ht
  results: []
---


# NNLB-alt-en-bleu-ht

This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 2.4011
- Bleu: 40.828
- Gen Len: 26.385
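
The snippet below is a minimal inference sketch, not documented usage from the author: the `eng_Latn` → `hat_Latn` direction is inferred from the `en`/`ht` in the repository name, and the repo id is a placeholder for the full hub path.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hypothetical repo id; substitute the full "<user>/NNLB-alt-en-bleu-ht" path.
model_name = "NNLB-alt-en-bleu-ht"

tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("How are you today?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    # NLLB models need the target language forced as the first decoder token.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("hat_Latn"),
    max_length=64,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```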

## Model description

The base checkpoint, [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M), is the 600M-parameter distilled variant of Meta AI's NLLB-200 multilingual translation model. Judging by the repository name, this fine-tune targets English-to-Haitian Creole translation (`eng_Latn` → `hat_Latn` in NLLB's language codes); the fine-tuning dataset itself was not recorded by the Trainer.

## Intended uses & limitations

Intended for machine-translation research and experimentation, presumably in the English–Haitian Creole direction suggested by the model name. The `cc-by-nc-4.0` license inherited from NLLB-200 restricts use to non-commercial purposes. Because the training data is undocumented, domain coverage and output quality outside the evaluation set are unknown.

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the code sketch after the list):
- learning_rate: 0.0001
- train_batch_size: 14
- eval_batch_size: 14
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
- mixed_precision_training: Native AMP
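
These settings map onto `Seq2SeqTrainingArguments` roughly as sketched below, under the assumption that `Seq2SeqTrainer` was used; `output_dir` and any arguments not listed above are placeholders, not the author's values. The Adam betas and epsilon shown in the list are the library defaults, so they need not be set explicitly.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="NNLB-alt-en-bleu-ht",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=14,
    per_device_eval_batch_size=14,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    fp16=True,                    # "Native AMP" mixed-precision training
    evaluation_strategy="epoch",  # assumed: the results table logs per epoch
    predict_with_generate=True,   # assumed: needed for BLEU / Gen Len
)
```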

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 2.0036        | 1.0   | 799   | 1.4377          | 25.1487 | 25.036  |
| 1.2584        | 2.0   | 1598  | 1.3276          | 29.603  | 25.723  |
| 1.0147        | 3.0   | 2397  | 1.3204          | 31.3967 | 25.776  |
| 0.7379        | 4.0   | 3196  | 1.3678          | 32.4951 | 25.266  |
| 0.6228        | 5.0   | 3995  | 1.4250          | 34.6087 | 26.083  |
| 0.4327        | 6.0   | 4794  | 1.5342          | 36.6073 | 26.174  |
| 0.3437        | 7.0   | 5593  | 1.5952          | 37.7791 | 26.265  |
| 0.2689        | 8.0   | 6392  | 1.6993          | 38.16   | 26.376  |
| 0.2029        | 9.0   | 7191  | 1.7994          | 39.433  | 26.766  |
| 0.1711        | 10.0  | 7990  | 1.8893          | 39.2816 | 26.574  |
| 0.1214        | 11.0  | 8789  | 1.9661          | 39.5599 | 26.687  |
| 0.1017        | 12.0  | 9588  | 1.9928          | 39.7801 | 26.845  |
| 0.0855        | 13.0  | 10387 | 2.0508          | 39.8043 | 26.641  |
| 0.0679        | 14.0  | 11186 | 2.0998          | 40.3389 | 26.526  |
| 0.06          | 15.0  | 11985 | 2.1350          | 40.0964 | 26.395  |
| 0.0475        | 16.0  | 12784 | 2.1676          | 40.1536 | 26.614  |
| 0.0407        | 17.0  | 13583 | 2.2040          | 40.298  | 26.494  |
| 0.0347        | 18.0  | 14382 | 2.2294          | 40.5207 | 26.612  |
| 0.0315        | 19.0  | 15181 | 2.2484          | 40.3323 | 26.53   |
| 0.0286        | 20.0  | 15980 | 2.2828          | 40.3167 | 26.718  |
| 0.0241        | 21.0  | 16779 | 2.3015          | 40.0766 | 26.306  |
| 0.0213        | 22.0  | 17578 | 2.3267          | 40.477  | 26.457  |
| 0.0183        | 23.0  | 18377 | 2.3410          | 40.4013 | 26.406  |
| 0.0164        | 24.0  | 19176 | 2.3457          | 40.3643 | 26.534  |
| 0.0157        | 25.0  | 19975 | 2.3533          | 40.3967 | 26.506  |
| 0.0133        | 26.0  | 20774 | 2.3734          | 40.7786 | 26.38   |
| 0.0119        | 27.0  | 21573 | 2.3750          | 40.8653 | 26.525  |
| 0.0106        | 28.0  | 22372 | 2.3896          | 40.8371 | 26.503  |
| 0.0095        | 29.0  | 23171 | 2.3893          | 40.831  | 26.398  |
| 0.0094        | 30.0  | 23970 | 2.4011          | 40.828  | 26.385  |
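
For reference, `Gen Len` is the average length in tokens of the generated evaluation outputs. Below is a hedged sketch of reproducing the BLEU column; whether the run used the `bleu` or `sacrebleu` implementation is not recorded, so sacreBLEU is assumed here for its standard tokenization:

```python
import evaluate

# Assumption: corpus-level sacreBLEU, as commonly paired with Seq2SeqTrainer.
metric = evaluate.load("sacrebleu")
predictions = ["a decoded model output"]       # toy strings, not real data
references = [["the reference translation"]]   # one list of references per example
result = metric.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```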


### Framework versions

- Transformers 4.21.0
- Pytorch 1.10.0+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1