# mbart-large-50-many-to-many-mmt-finetuned-fij_Latn-to-eng_Latn
This model is a fine-tuned version of facebook/nllb-200-distilled-600M on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.9598
- Bleu: 45.0972
- Gen Len: 42.752
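
The checkpoint can be used for Fijian-to-English translation with the standard `transformers` seq2seq API. The snippet below is a minimal sketch rather than an official usage example: the repository id is a placeholder, and it assumes the NLLB-style language codes `fij_Latn` / `eng_Latn` from the model name.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder id -- replace with the actual Hub id of this checkpoint.
model_id = "your-username/mbart-large-50-many-to-many-mmt-finetuned-fij_Latn-to-eng_Latn"

tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="fij_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Bula vinaka!", return_tensors="pt")

# Force the decoder to start generation with the English (Latin script) token,
# as is usual for NLLB-style multilingual checkpoints.
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```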
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 6
- total_train_batch_size: 12
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 8000
- mixed_precision_training: Native AMP
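
These settings translate directly into `Seq2SeqTrainingArguments`. The block below is a sketch for reproduction only; dataset loading, tokenization, and the `Seq2SeqTrainer` call are omitted, and the 500-step evaluation cadence is inferred from the results table below.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-large-50-many-to-many-mmt-finetuned-fij_Latn-to-eng_Latn",
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=6,   # effective train batch size of 12
    max_steps=8000,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                       # Native AMP mixed-precision training
    evaluation_strategy="steps",     # evaluated every 500 steps (see table below)
    eval_steps=500,
    predict_with_generate=True,      # needed to report Bleu / Gen Len during eval
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default optimizer.
```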
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
| 2.4054        | 0.49  | 500  | 1.7028          | 24.9597 | 43.04   |
| 1.6855        | 0.98  | 1000 | 1.3701          | 33.3128 | 42.2    |
| 1.4042        | 1.47  | 1500 | 1.2224          | 37.6016 | 43.536  |
| 1.2991        | 1.96  | 2000 | 1.1467          | 40.3541 | 42.428  |
| 1.1819        | 2.45  | 2500 | 1.0950          | 42.2106 | 42.58   |
| 1.1323        | 2.94  | 3000 | 1.0523          | 42.9418 | 42.76   |
| 1.0676        | 3.43  | 3500 | 1.0238          | 43.4974 | 42.684  |
| 1.0404        | 3.93  | 4000 | 1.0082          | 43.6092 | 42.616  |
| 0.9882        | 4.42  | 4500 | 0.9942          | 44.7199 | 42.912  |
| 0.982         | 4.91  | 5000 | 0.9814          | 44.8061 | 42.516  |
| 0.9372        | 5.4   | 5500 | 0.9781          | 44.3808 | 42.476  |
| 0.9382        | 5.89  | 6000 | 0.9675          | 45.0267 | 42.76   |
| 0.915         | 6.38  | 6500 | 0.9659          | 45.0073 | 42.676  |
| 0.9126        | 6.87  | 7000 | 0.9617          | 44.9582 | 42.548  |
| 0.8903        | 7.36  | 7500 | 0.9609          | 44.8713 | 42.724  |
| 0.8873        | 7.85  | 8000 | 0.9598          | 45.0972 | 42.752  |
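
The Bleu and Gen Len columns are consistent with a `compute_metrics` hook along the lines of the `transformers` translation examples. The sketch below is an assumption about how these values may have been produced (it reuses `tokenizer` from the earlier snippet), not the exact code of this run.

```python
import numpy as np
from datasets import load_metric

metric = load_metric("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)

    # Replace the -100 padding used for ignored label positions before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = metric.compute(
        predictions=[p.strip() for p in decoded_preds],
        references=[[l.strip()] for l in decoded_labels],
    )
    gen_lens = [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    return {"bleu": result["score"], "gen_len": float(np.mean(gen_lens))}
```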
### Framework versions
- Transformers 4.21.3
- Pytorch 1.12.1+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1