results

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1030
  • SacreBLEU: 36.1071 (0-100 scale)
  • BLEU: 0.3611 (the same score on a 0-1 scale)
  • ROUGE-1: 0.6827
  • ROUGE-2: 0.4557
  • ROUGE-L: 0.6584
  • ROUGE-Lsum: 0.6574
  • TER: 44.4372 (lower is better)
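
Because the base model is mBART-50 (many-to-many MMT), the checkpoint should load with the standard mBART classes. Below is a minimal inference sketch; the card does not state the language pair, so the `en_XX`/`fr_XX` codes are placeholders only:

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model = MBartForConditionalGeneration.from_pretrained("Strange18/results")
tokenizer = MBart50TokenizerFast.from_pretrained("Strange18/results")

tokenizer.src_lang = "en_XX"  # placeholder: source language not stated on the card
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
generated = model.generate(
    **inputs,
    # placeholder: target language not stated on the card
    forced_bos_token_id=tokenizer.lang_code_to_id["fr_XX"],
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```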

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 6
  • mixed_precision_training: Native AMP
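
These values map back onto Hugging Face `Seq2SeqTrainingArguments` roughly as follows. This is a reconstruction sketch: anything not in the list above (such as `output_dir`) is an assumption.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch reconstructing the listed hyperparameters; output_dir is assumed
# (it matches the repo name), everything else comes from the list above.
training_args = Seq2SeqTrainingArguments(
    output_dir="results",
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=4,   # effective train batch size: 4 * 4 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=6,
    fp16=True,                       # "Native AMP" mixed precision
)
```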

Training results

| Training Loss | Epoch  | Step | Validation Loss | SacreBLEU | BLEU   | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | TER     |
|---------------|--------|------|-----------------|-----------|--------|---------|---------|---------|------------|---------|
| 9.9972        | 0.3534 | 50   | 9.4988          | 19.6720   | 0.1967 | 0.5861  | 0.3417  | 0.5513  | 0.5522     | 58.0351 |
| 7.8177        | 0.7067 | 100  | 6.7801          | 26.8668   | 0.2687 | 0.6116  | 0.3719  | 0.5788  | 0.5795     | 53.6109 |
| 4.1506        | 1.0565 | 150  | 3.1994          | 28.3941   | 0.2839 | 0.6266  | 0.3892  | 0.5946  | 0.5955     | 52.1145 |
| 1.2625        | 1.4099 | 200  | 0.7019          | 32.1449   | 0.3214 | 0.6506  | 0.4209  | 0.6206  | 0.6210     | 49.5771 |
| 0.1995        | 1.7633 | 250  | 0.1521          | 32.0543   | 0.3205 | 0.6496  | 0.4085  | 0.6208  | 0.6198     | 48.5361 |
| 0.0878        | 2.1131 | 300  | 0.1119          | 32.5852   | 0.3259 | 0.6653  | 0.4304  | 0.6378  | 0.6376     | 48.2759 |
| 0.0726        | 2.4664 | 350  | 0.1039          | 35.1243   | 0.3512 | 0.6755  | 0.4455  | 0.6478  | 0.6474     | 45.2180 |
| 0.0672        | 2.8198 | 400  | 0.1007          | 36.5523   | 0.3655 | 0.6749  | 0.4555  | 0.6538  | 0.6541     | 44.6975 |
| 0.0475        | 3.1696 | 450  | 0.1030          | 36.1071   | 0.3611 | 0.6827  | 0.4557  | 0.6584  | 0.6574     | 44.4372 |
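
The evaluation results at the top of this card match the step-450 row. The evaluation code itself is not included here, but the reported metrics can plausibly be reproduced with the `evaluate` library; the metric IDs below are the standard ones, not confirmed by the card:

```python
import evaluate

# Plausible metric computation; the card does not include its evaluation code.
sacrebleu = evaluate.load("sacrebleu")  # reports BLEU on a 0-100 scale
rouge = evaluate.load("rouge")
ter = evaluate.load("ter")

predictions = ["placeholder model output"]
references = [["placeholder reference translation"]]

print(sacrebleu.compute(predictions=predictions, references=references)["score"])
print(rouge.compute(predictions=predictions,
                    references=[r[0] for r in references]))  # rouge1/rouge2/rougeL/rougeLsum
print(ter.compute(predictions=predictions, references=references)["score"])
```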

Framework versions

  • Transformers 4.47.0
  • PyTorch 2.1.2
  • Datasets 3.1.0
  • Tokenizers 0.21.0
Model size

611M parameters, F32 tensors, stored in safetensors format.