Final-Nepali-to-English

This model is a fine-tuned version of Nepali-to-English on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0186
  • BLEU: 30.3843
  • Gen Len: 73.8617
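
The card does not include a usage snippet. Below is a minimal inference sketch using the transformers pipeline, assuming the checkpoint is a standard seq2seq translation model and that the repo id is BeebekBhz/Final-Nepali-to-English (an assumption inferred from the model name and the linked Space; verify the exact id on the Hub before use):

```python
# Minimal inference sketch. Assumptions: the repo id below and a standard
# seq2seq translation architecture; neither is confirmed by this card.
from transformers import pipeline

# Hypothetical repo id inferred from the model name; verify before use.
translator = pipeline(
    "translation",
    model="BeebekBhz/Final-Nepali-to-English",
)

result = translator("नमस्ते, तपाईंलाई कस्तो छ?", max_length=128)
print(result[0]["translation_text"])
```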

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
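
For reference, these values map onto a Seq2SeqTrainingArguments configuration roughly as sketched below. This is a reconstruction from the list above, not the author's training script; the output directory and predict_with_generate setting are assumptions.

```python
# Sketch of training arguments matching the hyperparameter list above
# (a reconstruction, not the author's script).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="final-ne-en",        # assumed name, not stated in the card
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=1,
    lr_scheduler_type="linear",
    adam_beta1=0.9,                  # Adam betas/epsilon as reported above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,      # assumed, since BLEU/Gen Len are reported
)
```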

Training results

| Training Loss | Epoch  | Step   | BLEU    | Gen Len | Validation Loss |
|:-------------:|:------:|:------:|:-------:|:-------:|:---------------:|
| 1.1924        | 0.0561 | 12000  | 26.1232 | 74.765  | 1.3707          |
| 1.0978        | 0.1122 | 24000  | 26.0469 | 78.0433 | 1.2622          |
| 1.0316        | 0.1684 | 36000  | 28.0954 | 73.6692 | 1.2075          |
| 0.9993        | 0.2245 | 48000  | 28.1497 | 75.1633 | 1.1745          |
| 0.9651        | 0.2806 | 60000  | 29.1482 | 75.9908 | 1.1428          |
| 0.9415        | 0.3367 | 72000  | 27.0537 | 82.6383 | 1.1183          |
| 0.9251        | 0.3928 | 84000  | 27.637  | 79.2592 | 1.0864          |
| 0.9008        | 0.4489 | 96000  | 29.0405 | 76.6583 | 1.0683          |
| 0.8726        | 0.5051 | 108000 | 29.923  | 75.4483 | 1.0494          |
| 0.8701        | 0.5612 | 120000 | 29.2328 | 77.2858 | 1.0316          |
| 0.8546        | 0.6173 | 132000 | 29.6585 | 76.1308 | 1.0185          |
| 0.8392        | 0.6734 | 144000 | 30.5079 | 78.0417 | 1.0072          |
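
The card does not state which BLEU implementation produced these scores. A common choice with the Seq2SeqTrainer is sacreBLEU via the evaluate library, as in the hedged sketch below; the exact metric setup used here is an assumption.

```python
# Hedged sketch of how corpus-level BLEU scores like those above are
# commonly computed; the card does not confirm sacreBLEU was used.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Hello, how are you?"]         # decoded model outputs (example)
references = [["Hello, how are you doing?"]]  # one or more references each
score = bleu.compute(predictions=predictions, references=references)
print(score["score"])  # corpus-level BLEU on a 0-100 scale
```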

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.20.3