Terjman-Nano-v2.1-512
This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-ar; the fine-tuning dataset is not documented. At the final logged checkpoint (step 35000), it achieves the following results on the evaluation set:
- Loss: 4.2875
- Bleu: 2.1295
- Gen Len: 10.2765
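The card does not include a usage example, so the following is a minimal sketch of loading the checkpoint with the Transformers auto classes. It assumes the repository ID BounharAbdelaziz/Terjman-Nano-v2.1-512 and that the model expects English input, as the base model Helsinki-NLP/opus-mt-en-ar does. The helper name `translate` and the `max_new_tokens` value are illustrative choices, not part of the original card.

```python
# Assumed repository ID; adjust if the checkpoint lives elsewhere.
MODEL_ID = "BounharAbdelaziz/Terjman-Nano-v2.1-512"


def translate(text, model_id=MODEL_ID, max_new_tokens=64):
    """Translate one input string with the fine-tuned checkpoint.

    Hypothetical helper: the imports are deferred so that importing this
    file does not require transformers until the function is called.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize, generate, and decode a single sentence.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the weights on first use.
    print(translate("How are you today?"))
```

Given the low BLEU score reported above, outputs should be checked by hand before being relied on.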
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 10
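As a rough guide to reproducing this setup, the hyperparameters above can be mapped onto keyword arguments of `transformers.Seq2SeqTrainingArguments`. This is a sketch, not the author's training script: the output directory is hypothetical, and the evaluation settings (evaluation every 1000 steps, generation during evaluation) are inferred from the results table and the reported BLEU and Gen Len metrics rather than stated in the card. The batch sizes are assumed to be per-device values.

```python
# Values listed in the card, keyed by Seq2SeqTrainingArguments parameter names.
training_kwargs = {
    "learning_rate": 3e-5,
    "per_device_train_batch_size": 64,  # card: train_batch_size
    "per_device_eval_batch_size": 64,   # card: eval_batch_size
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_ratio": 0.03,
    "num_train_epochs": 10,
    # Assumptions below: not stated in the hyperparameter list.
    "output_dir": "terjman-nano-finetune",  # hypothetical path
    "eval_strategy": "steps",               # inferred from the 1000-step log interval
    "eval_steps": 1000,
    "predict_with_generate": True,          # needed to compute BLEU and Gen Len
}


def build_training_args(**overrides):
    """Build the arguments object; requires transformers (and PyTorch)."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**{**training_kwargs, **overrides})
```

Calling `build_training_args(output_dir="my-run")` would override the hypothetical output path while keeping the documented values.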
Training results
Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
---|---|---|---|---|---|
4.1858 | 0.2804 | 1000 | 4.9872 | 1.1697 | 9.3165 |
3.7672 | 0.5609 | 2000 | 4.5251 | 1.6917 | 11.1082 |
3.5203 | 0.8413 | 3000 | 4.4067 | 1.779 | 10.5894 |
3.6506 | 1.1217 | 4000 | 4.3579 | 1.8055 | 11.7212 |
3.4325 | 1.4021 | 5000 | 4.3266 | 1.8151 | 10.4882 |
3.4966 | 1.6826 | 6000 | 4.3114 | 1.8294 | 10.52 |
3.4795 | 1.9630 | 7000 | 4.3022 | 2.0241 | 10.5553 |
3.5567 | 2.2434 | 8000 | 4.2977 | 2.0571 | 10.3271 |
3.6008 | 2.5238 | 9000 | 4.2954 | 2.1029 | 10.2718 |
3.5513 | 2.8043 | 10000 | 4.2923 | 2.0792 | 10.3929 |
3.5116 | 3.0847 | 11000 | 4.2898 | 1.8706 | 10.3741 |
3.4962 | 3.3651 | 12000 | 4.2901 | 2.107 | 10.4306 |
3.5444 | 3.6455 | 13000 | 4.2911 | 2.0825 | 10.9212 |
3.4893 | 3.9260 | 14000 | 4.2871 | 2.1052 | 10.2388 |
3.3988 | 4.2064 | 15000 | 4.2871 | 2.1329 | 10.2576 |
3.4946 | 4.4868 | 16000 | 4.2873 | 2.1086 | 10.8788 |
3.4212 | 4.7672 | 17000 | 4.2871 | 2.0519 | 11.0012 |
3.4958 | 5.0477 | 18000 | 4.2865 | 2.0286 | 10.8812 |
3.3869 | 5.3281 | 19000 | 4.2876 | 2.046 | 10.4082 |
3.5321 | 5.6085 | 20000 | 4.2874 | 2.1578 | 10.4035 |
3.4374 | 5.8890 | 21000 | 4.2874 | 2.0745 | 10.9247 |
3.5439 | 6.1694 | 22000 | 4.2880 | 2.0663 | 10.3671 |
3.421 | 6.4498 | 23000 | 4.2870 | 2.1364 | 10.8282 |
3.547 | 6.7302 | 24000 | 4.2872 | 2.1323 | 10.8835 |
3.5297 | 7.0107 | 25000 | 4.2877 | 2.119 | 10.9729 |
3.3617 | 7.2911 | 26000 | 4.2880 | 2.1283 | 10.4388 |
3.511 | 7.5715 | 27000 | 4.2873 | 2.1401 | 10.2506 |
3.3947 | 7.8519 | 28000 | 4.2863 | 2.1352 | 10.7718 |
3.4888 | 8.1324 | 29000 | 4.2877 | 2.1507 | 10.8153 |
3.4712 | 8.4128 | 30000 | 4.2877 | 2.1401 | 10.1859 |
3.3557 | 8.6932 | 31000 | 4.2873 | 2.0575 | 11.2671 |
3.5038 | 8.9736 | 32000 | 4.2879 | 2.1183 | 10.4471 |
3.4788 | 9.2541 | 33000 | 4.2875 | 2.1325 | 11.4282 |
3.5303 | 9.5345 | 34000 | 4.2878 | 2.1102 | 10.3012 |
3.5182 | 9.8149 | 35000 | 4.2875 | 2.1295 | 10.2765 |
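The epoch and step columns make it possible to estimate the schedule that was in effect. The sketch below infers the number of optimizer steps per epoch from the last logged row, then evaluates a linear warmup and decay schedule using the learning rate and warmup ratio from the hyperparameter list. It assumes the epoch column equals step divided by steps per epoch and that warmup steps are rounded up; the function name `linear_lr` is illustrative.

```python
import math

# Values copied from the last table row and the hyperparameter list.
LOGGED_STEP = 35_000
LOGGED_EPOCH = 9.8149
NUM_EPOCHS = 10
PEAK_LR = 3e-5
WARMUP_RATIO = 0.03

# Inferred quantities (estimates, not reported in the card).
steps_per_epoch = round(LOGGED_STEP / LOGGED_EPOCH)
total_steps = steps_per_epoch * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step):
    """Learning rate at a given optimizer step under linear warmup then linear decay."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(steps_per_epoch, total_steps, warmup_steps)
    for step in (0, 1000, 10_000, 20_000, 35_000):
        print(step, linear_lr(step))
```

Under these assumptions warmup ends shortly after the first logged evaluation at step 1000, and the learning rate has decayed to a small fraction of its peak by the final logged step, consistent with the validation loss remaining close to 4.287 over the later epochs.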
Framework versions
- Transformers 4.48.0.dev0
- PyTorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.21.0
Model tree for BounharAbdelaziz/Terjman-Nano-v2.1-512
Base model
Helsinki-NLP/opus-mt-en-ar