Llama-3.2-400M-Amharic-Poems-Stories-V5

This model is a fine-tuned version of rasyosef/Llama-3.2-400M-Amharic, trained on a collection of Amharic poems and stories (the exact dataset is not documented on this card). It achieves the following results on the evaluation set:

  • Loss: 1.2025
  • Model Preparation Time: 0.003 seconds
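
For intuition, a cross-entropy loss of 1.2025 corresponds to a perplexity of about 3.33. A minimal check, assuming the reported loss is the usual mean per-token cross-entropy in nats:

```python
import math

# Perplexity implied by the reported evaluation loss
# (assumed to be mean per-token cross-entropy in nats).
eval_loss = 1.2025
print(math.exp(eval_loss))  # ~3.33
```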

Model description

More information needed

Intended uses & limitations

More information needed
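
A minimal inference sketch, assuming the model is used as a causal language model for Amharic text generation via the transformers pipeline; the prompt and sampling settings below are illustrative assumptions, not documented recommendations:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint as a text-generation pipeline.
generator = pipeline(
    "text-generation",
    model="yosefw/Llama-3.2-400M-Amharic-Poems-Stories-V5",
)

# "ሰላም" ("hello/peace") is an illustrative Amharic prompt;
# the sampling settings here are assumptions.
output = generator(
    "ሰላም",
    max_new_tokens=128,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(output[0]["generated_text"])
```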

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • lr_scheduler_warmup_steps: 250
  • num_epochs: 4
  • mixed_precision_training: Native AMP
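
A hedged sketch of how these settings map onto transformers TrainingArguments; the actual training script is not published, so the output_dir, evaluation cadence, and Trainer wiring are assumptions:

```python
from transformers import TrainingArguments

# Mirror of the hyperparameters listed above. Values not stated on the card
# (output_dir, eval_steps) are assumptions. Note that when both warmup_ratio
# and warmup_steps are set, transformers uses warmup_steps.
training_args = TrainingArguments(
    output_dir="Llama-3.2-400M-Amharic-Poems-Stories-V5",  # assumed
    learning_rate=8e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    warmup_steps=250,
    num_train_epochs=4,
    fp16=True,          # Native AMP mixed-precision training
    eval_strategy="steps",
    eval_steps=500,     # matches the 500-step cadence in the results table
)
```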

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Model Preparation Time |
|:-------------:|:------:|:-----:|:---------------:|:----------------------:|
| 3.0159        | 0.1219 | 500   | 2.4049          | 0.003                  |
| 2.0028        | 0.2437 | 1000  | 1.6978          | 0.003                  |
| 1.5634        | 0.3656 | 1500  | 1.4614          | 0.003                  |
| 1.4024        | 0.4874 | 2000  | 1.3237          | 0.003                  |
| 1.2445        | 0.6093 | 2500  | 1.2619          | 0.003                  |
| 1.3169        | 0.7312 | 3000  | 1.2383          | 0.003                  |
| 1.2045        | 0.8530 | 3500  | 1.2104          | 0.003                  |
| 1.2074        | 0.9749 | 4000  | 1.2025          | 0.003                  |
| 0.715         | 1.0968 | 4500  | 1.3746          | 0.003                  |
| 0.5526        | 1.2186 | 5000  | 1.3882          | 0.003                  |
| 0.5383        | 1.3405 | 5500  | 1.3509          | 0.003                  |
| 0.5541        | 1.4623 | 6000  | 1.3664          | 0.003                  |
| 0.561         | 1.5842 | 6500  | 1.3486          | 0.003                  |
| 0.5278        | 1.7061 | 7000  | 1.3447          | 0.003                  |
| 0.5415        | 1.8279 | 7500  | 1.3282          | 0.003                  |
| 0.5491        | 1.9498 | 8000  | 1.3404          | 0.003                  |
| 0.3309        | 2.0717 | 8500  | 1.5191          | 0.003                  |
| 0.1989        | 2.1935 | 9000  | 1.5328          | 0.003                  |
| 0.2018        | 2.3154 | 9500  | 1.5266          | 0.003                  |
| 0.1954        | 2.4372 | 10000 | 1.5309          | 0.003                  |
| 0.1957        | 2.5591 | 10500 | 1.5363          | 0.003                  |
| 0.1983        | 2.6810 | 11000 | 1.5228          | 0.003                  |
| 0.1962        | 2.8028 | 11500 | 1.5295          | 0.003                  |
| 0.1934        | 2.9247 | 12000 | 1.5303          | 0.003                  |
| 0.1566        | 3.0466 | 12500 | 1.5726          | 0.003                  |
| 0.1045        | 3.1684 | 13000 | 1.5975          | 0.003                  |
| 0.1071        | 3.2903 | 13500 | 1.5947          | 0.003                  |
| 0.1076        | 3.4121 | 14000 | 1.5920          | 0.003                  |
| 0.1068        | 3.5340 | 14500 | 1.5922          | 0.003                  |
| 0.1052        | 3.6559 | 15000 | 1.5921          | 0.003                  |
| 0.1074        | 3.7777 | 15500 | 1.5922          | 0.003                  |
| 0.1084        | 3.8996 | 16000 | 1.5923          | 0.003                  |

Note that the reported evaluation loss of 1.2025 comes from the best checkpoint at step 4000, near the end of epoch 1: from epoch 2 onward the training loss keeps falling while the validation loss climbs back above 1.5, indicating overfitting to the training set.

Framework versions

  • Transformers 4.45.0
  • PyTorch 2.4.1+cu121
  • Datasets 3.3.2
  • Tokenizers 0.20.3