llm3br256

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct on the centime dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0070
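Since PEFT is listed among the framework versions below, the released weights appear to be a PEFT adapter on top of the base model. The following is a minimal loading and inference sketch, not the original training or serving code; the adapter repository id "sizhkhy/llm3br256" is an assumption inferred from the card title and should be replaced with the actual Hub path if it differs.

```python
# Minimal inference sketch (illustrative only, not shipped with this card).
# "sizhkhy/llm3br256" is an assumed adapter repo id; substitute the real Hub path.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-3.2-3B-Instruct"
adapter_id = "sizhkhy/llm3br256"  # assumption, see note above

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Build a chat prompt with the Llama 3.2 chat template and generate a reply.
messages = [{"role": "user", "content": "Hello, what can you help me with?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(base_model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=256)

print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```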

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows this list):

  • learning_rate: 0.0001
  • train_batch_size: 48
  • eval_batch_size: 48
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 25.0
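
For reference, here is a minimal sketch of how these settings might be expressed as a Hugging Face TrainingArguments object. It assumes the listed batch sizes are per device and omits everything the card does not specify (LoRA/PEFT configuration, dataset loading, and the trainer itself), so treat it as illustrative rather than the actual training script.

```python
# Sketch of TrainingArguments matching the hyperparameters listed above.
# Assumptions: batch sizes are per device; output_dir taken from the card title.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llm3br256",
    learning_rate=1e-4,
    per_device_train_batch_size=48,
    per_device_eval_batch_size=48,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=25.0,
)
```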

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 0.1159 | 0.1208 | 25 | 0.1004 |
| 0.0843 | 0.2415 | 50 | 0.0635 |
| 0.0763 | 0.3623 | 75 | 0.0474 |
| 0.0496 | 0.4831 | 100 | 0.0365 |
| 0.046 | 0.6039 | 125 | 0.0316 |
| 0.0368 | 0.7246 | 150 | 0.0266 |
| 0.0283 | 0.8454 | 175 | 0.0232 |
| 0.0237 | 0.9662 | 200 | 0.0212 |
| 0.0234 | 1.0870 | 225 | 0.0194 |
| 0.0232 | 1.2077 | 250 | 0.0176 |
| 0.0307 | 1.3285 | 275 | 0.0178 |
| 0.0228 | 1.4493 | 300 | 0.0147 |
| 0.0167 | 1.5700 | 325 | 0.0155 |
| 0.0238 | 1.6908 | 350 | 0.0125 |
| 0.0191 | 1.8116 | 375 | 0.0138 |
| 0.0273 | 1.9324 | 400 | 0.0120 |
| 0.0194 | 2.0531 | 425 | 0.0125 |
| 0.0125 | 2.1739 | 450 | 0.0128 |
| 0.0132 | 2.2947 | 475 | 0.0117 |
| 0.0142 | 2.4155 | 500 | 0.0099 |
| 0.0119 | 2.5362 | 525 | 0.0105 |
| 0.0131 | 2.6570 | 550 | 0.0118 |
| 0.0089 | 2.7778 | 575 | 0.0100 |
| 0.0158 | 2.8986 | 600 | 0.0096 |
| 0.0119 | 3.0193 | 625 | 0.0096 |
| 0.0097 | 3.1401 | 650 | 0.0099 |
| 0.0089 | 3.2609 | 675 | 0.0092 |
| 0.0087 | 3.3816 | 700 | 0.0088 |
| 0.0083 | 3.5024 | 725 | 0.0088 |
| 0.0088 | 3.6232 | 750 | 0.0080 |
| 0.0058 | 3.7440 | 775 | 0.0069 |
| 0.008 | 3.8647 | 800 | 0.0070 |
| 0.0099 | 3.9855 | 825 | 0.0073 |
| 0.0072 | 4.1063 | 850 | 0.0113 |
| 0.0065 | 4.2271 | 875 | 0.0107 |
| 0.0079 | 4.3478 | 900 | 0.0097 |
| 0.0081 | 4.4686 | 925 | 0.0103 |

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.4.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3
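
To check that a local environment matches the versions listed above, a small sketch like the following can be used; the package names are the standard distribution names and the expected versions are copied from this list.

```python
# Compare installed package versions against the versions listed in this card.
# Illustrative helper only; it is not part of the original training code.
import importlib.metadata as md

expected_versions = {
    "peft": "0.12.0",
    "transformers": "4.46.1",
    "torch": "2.4.0+cu121",
    "datasets": "3.1.0",
    "tokenizers": "0.20.3",
}

for package, expected in expected_versions.items():
    try:
        installed = md.version(package)
    except md.PackageNotFoundError:
        installed = "not installed"
    print(f"{package}: installed {installed}, card trained with {expected}")
```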
