metadata

license: mit
tags:
  - generated_from_trainer
datasets:
  - iva_mt_wslot
metrics:
  - bleu
model-index:
  - name: iva_mt_wslot-m2m100_418M-en-pt
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: iva_mt_wslot
          type: iva_mt_wslot
          config: en-pt
          split: validation
          args: en-pt
        metrics:
          - name: Bleu
            type: bleu
            value: 67.0512

iva_mt_wslot-m2m100_418M-en-pt

This model is a fine-tuned version of facebook/m2m100_418M on the en-pt configuration of the iva_mt_wslot dataset. It achieves the following results on the validation split after the final training epoch:

Loss: 0.0119
Bleu: 67.0512
Gen Len: 20.3665
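
As a rough illustration of how a checkpoint like this one could be used for inference, here is a minimal sketch built on the standard M2M100 classes in Transformers. The `translate` helper and its `model_id` parameter are hypothetical names introduced for this example. This card does not state the Hub path of the fine-tuned checkpoint, so `model_id` must be supplied by the caller. The language codes `en` and `pt` are inferred from the `en-pt` configuration name.

```python
from typing import List

SRC_LANG = "en"  # source language code, inferred from the en-pt configuration
TGT_LANG = "pt"  # target language code, inferred from the en-pt configuration


def translate(texts: List[str], model_id: str) -> List[str]:
    """Translate English sentences to Portuguese with an M2M100 checkpoint.

    `model_id` is a Hub path or a local directory. It is an assumption of this
    sketch: the card does not say where the fine-tuned weights are stored.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_id)
    model = M2M100ForConditionalGeneration.from_pretrained(model_id)

    # The tokenizer prepends the source language token to each input.
    tokenizer.src_lang = SRC_LANG
    encoded = tokenizer(texts, return_tensors="pt", padding=True)

    # M2M100 selects the output language through the first generated token.
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id(TGT_LANG),
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["turn on the lights"], model_id=...)` returns one Portuguese string per input sentence. The exact output depends on the checkpoint that is loaded.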

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 7
mixed_precision_training: Native AMP
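
The settings above can be sketched in code as follows. This is an illustration under stated assumptions, not the original training script. The mapping of card fields to `Seq2SeqTrainingArguments` keyword names is an assumption, and so is the absence of warmup, since the card lists none. The total step count of 12894 is taken from the last row of the training results table. The `linear_lr` helper is a hypothetical name that reproduces the shape of a linear schedule: the rate decays in a straight line from its initial value to zero.

```python
LEARNING_RATE = 2e-5  # learning_rate from the card
TOTAL_STEPS = 12894   # final step in the training results table
WARMUP_STEPS = 0      # assumption: the card lists no warmup


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Keyword arguments one might pass to transformers.Seq2SeqTrainingArguments.
# The argument names are an assumed mapping from the fields listed on the card.
training_kwargs = {
    "learning_rate": LEARNING_RATE,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 7,
    "fp16": True,  # "Native AMP" mixed precision; requires a CUDA device
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
}
```

Under these assumptions the learning rate is 2e-05 at step 0, about half of that at step 6447 (the midpoint of training), and zero at step 12894.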

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
0.016	1.0	1842	0.0132	62.2701	20.1343
0.0103	2.0	3684	0.0117	65.7139	20.2191
0.0076	3.0	5526	0.0116	65.578	20.0926
0.0059	4.0	7368	0.0115	66.3728	20.4514
0.0043	5.0	9210	0.0117	65.8861	20.3781
0.0033	6.0	11052	0.0117	66.6496	20.4383
0.0026	7.0	12894	0.0119	67.0512	20.3665
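
The table above can be read programmatically to see which epoch is best under each metric. The sketch below copies the rows verbatim. The `EpochResult` type and the variable names are hypothetical. The estimate of the training set size assumes a single device and no gradient accumulation, neither of which the card confirms.

```python
from typing import NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    step: int
    train_loss: float
    val_loss: float
    bleu: float
    gen_len: float


# Rows copied from the training results table.
RESULTS = [
    EpochResult(1, 1842, 0.0160, 0.0132, 62.2701, 20.1343),
    EpochResult(2, 3684, 0.0103, 0.0117, 65.7139, 20.2191),
    EpochResult(3, 5526, 0.0076, 0.0116, 65.5780, 20.0926),
    EpochResult(4, 7368, 0.0059, 0.0115, 66.3728, 20.4514),
    EpochResult(5, 9210, 0.0043, 0.0117, 65.8861, 20.3781),
    EpochResult(6, 11052, 0.0033, 0.0117, 66.6496, 20.4383),
    EpochResult(7, 12894, 0.0026, 0.0119, 67.0512, 20.3665),
]

TRAIN_BATCH_SIZE = 4  # train_batch_size from the card

# Optimizer steps in one epoch, read from the first row.
steps_per_epoch = RESULTS[0].step

# Assumption: one device and no gradient accumulation, so each step
# consumes at most TRAIN_BATCH_SIZE examples.
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

best_by_bleu = max(RESULTS, key=lambda r: r.bleu)
best_by_val_loss = min(RESULTS, key=lambda r: r.val_loss)

print(f"steps per epoch: {steps_per_epoch}")
print(f"approximate training examples: {approx_train_examples}")
print(f"best BLEU: epoch {best_by_bleu.epoch} ({best_by_bleu.bleu})")
print(f"lowest validation loss: epoch {best_by_val_loss.epoch} ({best_by_val_loss.val_loss})")
```

The highest BLEU comes from the final epoch, which matches the headline numbers on this card, while validation loss is lowest at epoch 4 and rises slightly afterwards as training loss keeps falling.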

Framework versions

Transformers 4.28.1
PyTorch 2.0.0+cu118
Datasets 2.11.0
Tokenizers 0.13.3