# NNLB-alt-en-bleu-ht
This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.4011
- Bleu: 40.828
- Gen Len: 26.385
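The model name suggests an English → Haitian Creole translation direction. Below is a minimal inference sketch under that assumption; the checkpoint path is a placeholder, and `eng_Latn` / `hat_Latn` are the FLORES-200 codes NLLB uses for English and Haitian Creole.

```python
# Minimal inference sketch, assuming an English -> Haitian Creole direction.
# The checkpoint path is a placeholder for wherever the fine-tuned weights live.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "path/to/NNLB-alt-en-bleu-ht"  # hypothetical path
tokenizer = AutoTokenizer.from_pretrained(checkpoint, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

inputs = tokenizer("How are you today?", return_tensors="pt")
# NLLB selects the target language by forcing its FLORES-200 code
# as the first generated token.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("hat_Latn"),
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```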
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 14
- eval_batch_size: 14
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
- mixed_precision_training: Native AMP
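These settings map directly onto `Seq2SeqTrainingArguments`. The sketch below is a reconstruction, not the published training script: `output_dir` and the per-epoch evaluation strategy are assumptions (the latter matches the results table below), and the Adam betas and epsilon listed above are the Trainer defaults, so they need no explicit arguments.

```python
# Sketch of the reported configuration (Transformers 4.21); the actual
# training script is not published, so output_dir and evaluation_strategy
# are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="NNLB-alt-en-bleu-ht",  # hypothetical output directory
    learning_rate=1e-4,
    per_device_train_batch_size=14,
    per_device_eval_batch_size=14,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    fp16=True,                          # "Native AMP" mixed precision
    evaluation_strategy="epoch",        # assumed: the table below reports per-epoch metrics
    predict_with_generate=True,         # required to compute Bleu / Gen Len
)
```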
### Training results
| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 2.0036 | 1.0  | 799   | 1.4377 | 25.1487 | 25.036 |
| 1.2584 | 2.0  | 1598  | 1.3276 | 29.603  | 25.723 |
| 1.0147 | 3.0  | 2397  | 1.3204 | 31.3967 | 25.776 |
| 0.7379 | 4.0  | 3196  | 1.3678 | 32.4951 | 25.266 |
| 0.6228 | 5.0  | 3995  | 1.4250 | 34.6087 | 26.083 |
| 0.4327 | 6.0  | 4794  | 1.5342 | 36.6073 | 26.174 |
| 0.3437 | 7.0  | 5593  | 1.5952 | 37.7791 | 26.265 |
| 0.2689 | 8.0  | 6392  | 1.6993 | 38.16   | 26.376 |
| 0.2029 | 9.0  | 7191  | 1.7994 | 39.433  | 26.766 |
| 0.1711 | 10.0 | 7990  | 1.8893 | 39.2816 | 26.574 |
| 0.1214 | 11.0 | 8789  | 1.9661 | 39.5599 | 26.687 |
| 0.1017 | 12.0 | 9588  | 1.9928 | 39.7801 | 26.845 |
| 0.0855 | 13.0 | 10387 | 2.0508 | 39.8043 | 26.641 |
| 0.0679 | 14.0 | 11186 | 2.0998 | 40.3389 | 26.526 |
| 0.06   | 15.0 | 11985 | 2.1350 | 40.0964 | 26.395 |
| 0.0475 | 16.0 | 12784 | 2.1676 | 40.1536 | 26.614 |
| 0.0407 | 17.0 | 13583 | 2.2040 | 40.298  | 26.494 |
| 0.0347 | 18.0 | 14382 | 2.2294 | 40.5207 | 26.612 |
| 0.0315 | 19.0 | 15181 | 2.2484 | 40.3323 | 26.53  |
| 0.0286 | 20.0 | 15980 | 2.2828 | 40.3167 | 26.718 |
| 0.0241 | 21.0 | 16779 | 2.3015 | 40.0766 | 26.306 |
| 0.0213 | 22.0 | 17578 | 2.3267 | 40.477  | 26.457 |
| 0.0183 | 23.0 | 18377 | 2.3410 | 40.4013 | 26.406 |
| 0.0164 | 24.0 | 19176 | 2.3457 | 40.3643 | 26.534 |
| 0.0157 | 25.0 | 19975 | 2.3533 | 40.3967 | 26.506 |
| 0.0133 | 26.0 | 20774 | 2.3734 | 40.7786 | 26.38  |
| 0.0119 | 27.0 | 21573 | 2.3750 | 40.8653 | 26.525 |
| 0.0106 | 28.0 | 22372 | 2.3896 | 40.8371 | 26.503 |
| 0.0095 | 29.0 | 23171 | 2.3893 | 40.831  | 26.398 |
| 0.0094 | 30.0 | 23970 | 2.4011 | 40.828  | 26.385 |
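The Bleu and Gen Len columns are the kind of metrics usually produced by a `compute_metrics` callback with the `evaluate` library; the sketch below shows one common way to compute them (the exact metric code used for this model is not published, and `tokenizer` is assumed to be the model's tokenizer, in scope via closure).

```python
# Hedged sketch of a compute_metrics callback producing Bleu and Gen Len.
import numpy as np
import evaluate

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # Labels are padded with -100 by the data collator; restore pad tokens
    # before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Gen Len: mean length of the generated sequences, in tokens.
    gen_len = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    )
    return {"bleu": result["score"], "gen_len": gen_len}
```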
### Framework versions
- Transformers 4.21.0
- Pytorch 1.10.0+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1