results

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1030
  • SacreBLEU: 36.1071 (0-100 scale)
  • BLEU: 0.3611 (the same score on a 0-1 scale)
  • ROUGE-1: 0.6827
  • ROUGE-2: 0.4557
  • ROUGE-L: 0.6584
  • ROUGE-Lsum: 0.6574
  • TER: 44.4372 (lower is better)
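
Because the base model is mBART-50 (many-to-many MMT), the checkpoint should load with the standard mBART classes. Below is a minimal inference sketch; the card does not state the language pair, so the `en_XX`/`fr_XX` codes are placeholders only:

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model = MBartForConditionalGeneration.from_pretrained("Strange18/results")
tokenizer = MBart50TokenizerFast.from_pretrained("Strange18/results")

tokenizer.src_lang = "en_XX"  # placeholder: source language not stated on the card
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
generated = model.generate(
    **inputs,
    # placeholder: target language not stated on the card
    forced_bos_token_id=tokenizer.lang_code_to_id["fr_XX"],
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```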

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 6
  • mixed_precision_training: Native AMP
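
These values map back onto Hugging Face `Seq2SeqTrainingArguments` roughly as follows. This is a reconstruction sketch: anything not in the list above (such as `output_dir`) is an assumption.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch reconstructing the listed hyperparameters; output_dir is assumed
# (it matches the repo name), everything else comes from the list above.
training_args = Seq2SeqTrainingArguments(
    output_dir="results",
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=4,   # effective train batch size: 4 * 4 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=6,
    fp16=True,                       # "Native AMP" mixed precision
)
```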

Training results

| Training Loss | Epoch  | Step | Validation Loss | SacreBLEU | BLEU   | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | TER     |
|---------------|--------|------|-----------------|-----------|--------|---------|---------|---------|------------|---------|
| 9.9972        | 0.3534 | 50   | 9.4988          | 19.6720   | 0.1967 | 0.5861  | 0.3417  | 0.5513  | 0.5522     | 58.0351 |
| 7.8177        | 0.7067 | 100  | 6.7801          | 26.8668   | 0.2687 | 0.6116  | 0.3719  | 0.5788  | 0.5795     | 53.6109 |
| 4.1506        | 1.0565 | 150  | 3.1994          | 28.3941   | 0.2839 | 0.6266  | 0.3892  | 0.5946  | 0.5955     | 52.1145 |
| 1.2625        | 1.4099 | 200  | 0.7019          | 32.1449   | 0.3214 | 0.6506  | 0.4209  | 0.6206  | 0.6210     | 49.5771 |
| 0.1995        | 1.7633 | 250  | 0.1521          | 32.0543   | 0.3205 | 0.6496  | 0.4085  | 0.6208  | 0.6198     | 48.5361 |
| 0.0878        | 2.1131 | 300  | 0.1119          | 32.5852   | 0.3259 | 0.6653  | 0.4304  | 0.6378  | 0.6376     | 48.2759 |
| 0.0726        | 2.4664 | 350  | 0.1039          | 35.1243   | 0.3512 | 0.6755  | 0.4455  | 0.6478  | 0.6474     | 45.2180 |
| 0.0672        | 2.8198 | 400  | 0.1007          | 36.5523   | 0.3655 | 0.6749  | 0.4555  | 0.6538  | 0.6541     | 44.6975 |
| 0.0475        | 3.1696 | 450  | 0.1030          | 36.1071   | 0.3611 | 0.6827  | 0.4557  | 0.6584  | 0.6574     | 44.4372 |
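
The evaluation results at the top of this card match the step-450 row. The evaluation code itself is not included here, but the reported metrics can plausibly be reproduced with the `evaluate` library; the metric IDs below are the standard ones, not confirmed by the card:

```python
import evaluate

# Plausible metric computation; the card does not include its evaluation code.
sacrebleu = evaluate.load("sacrebleu")  # reports BLEU on a 0-100 scale
rouge = evaluate.load("rouge")
ter = evaluate.load("ter")

predictions = ["placeholder model output"]
references = [["placeholder reference translation"]]

print(sacrebleu.compute(predictions=predictions, references=references)["score"])
print(rouge.compute(predictions=predictions,
                    references=[r[0] for r in references]))  # rouge1/rouge2/rougeL/rougeLsum
print(ter.compute(predictions=predictions, references=references)["score"])
```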

Framework versions

  • Transformers 4.47.0
  • PyTorch 2.1.2
  • Datasets 3.1.0
  • Tokenizers 0.21.0
Model size

611M parameters, F32 tensors, stored in safetensors format.