loose_balanced_cf_seed-63_1e-3

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.1956
  • Accuracy: 0.4006

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 63
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 32000
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
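
The hyperparameters above imply two derived quantities that are easy to check by hand: the total train batch size, and the learning rate actually reached under a linear schedule with warmup. The sketch below is a minimal pure-Python illustration, not the training code itself. The function names are hypothetical, a single device is assumed, and the total step count of 29700 is taken from the last row of the training results rather than from any stated configuration. Under those assumptions, the 32000 warmup steps exceed the total number of optimizer steps, so the schedule would still be warming up when training ended.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_warmup_lr(step, base_lr=1e-3, warmup_steps=32000, total_steps=29700):
    """Learning rate for a linear warmup followed by a linear decay to zero.

    total_steps=29700 is assumed from the final row of the results table.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("total train batch size:", effective_batch_size(32, 8))
    print("lr at final step:", linear_warmup_lr(29700))
```

With these inputs the batch size comes out to 256, matching total_train_batch_size, and the learning rate at step 29700 is about 9.28e-4, slightly below the configured 0.001.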

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
|---------------|---------|-------|-----------------|----------|
| 5.9736        | 0.9994  | 1485  | 4.4156          | 0.2928   |
| 4.3066        | 1.9995  | 2971  | 3.9060          | 0.3332   |
| 3.6797        | 2.9996  | 4457  | 3.6355          | 0.3566   |
| 3.4925        | 3.9997  | 5943  | 3.4738          | 0.3709   |
| 3.2645        | 4.9997  | 7429  | 3.3761          | 0.3799   |
| 3.1915        | 5.9998  | 8915  | 3.3184          | 0.3856   |
| 3.0832        | 6.9999  | 10401 | 3.2787          | 0.3894   |
| 3.0425        | 8.0     | 11887 | 3.2506          | 0.3924   |
| 2.9847        | 8.9994  | 13372 | 3.2324          | 0.3942   |
| 2.9552        | 9.9995  | 14858 | 3.2234          | 0.3955   |
| 2.9204        | 10.9996 | 16344 | 3.2134          | 0.3966   |
| 2.8999        | 11.9997 | 17830 | 3.2045          | 0.3980   |
| 2.8789        | 12.9997 | 19316 | 3.2006          | 0.3985   |
| 2.8601        | 13.9998 | 20802 | 3.1995          | 0.3984   |
| 2.8494        | 14.9999 | 22288 | 3.1956          | 0.3990   |
| 2.8313        | 16.0    | 23774 | 3.1950          | 0.3997   |
| 2.8307        | 16.9994 | 25259 | 3.1953          | 0.3998   |
| 2.81          | 17.9995 | 26745 | 3.1945          | 0.4002   |
| 2.8189        | 18.9996 | 28231 | 3.1877          | 0.4007   |
| 2.7972        | 19.9882 | 29700 | 3.1956          | 0.4006   |
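
The per-epoch results can be inspected programmatically, for example to find the checkpoint with the lowest validation loss. The sketch below copies the step, validation loss, and accuracy columns from the results above. The perplexity conversion is an assumption: it only makes sense if the reported loss is a mean token-level cross-entropy in nats, which this card does not state.

```python
import math

# (step, validation_loss, accuracy), copied from the training results.
RESULTS = [
    (1485, 4.4156, 0.2928), (2971, 3.9060, 0.3332), (4457, 3.6355, 0.3566),
    (5943, 3.4738, 0.3709), (7429, 3.3761, 0.3799), (8915, 3.3184, 0.3856),
    (10401, 3.2787, 0.3894), (11887, 3.2506, 0.3924), (13372, 3.2324, 0.3942),
    (14858, 3.2234, 0.3955), (16344, 3.2134, 0.3966), (17830, 3.2045, 0.3980),
    (19316, 3.2006, 0.3985), (20802, 3.1995, 0.3984), (22288, 3.1956, 0.3990),
    (23774, 3.1950, 0.3997), (25259, 3.1953, 0.3998), (26745, 3.1945, 0.4002),
    (28231, 3.1877, 0.4007), (29700, 3.1956, 0.4006),
]

# Evaluation point with the lowest validation loss.
best = min(RESULTS, key=lambda row: row[1])
final = RESULTS[-1]

# Assumption: loss is mean cross-entropy in nats, so exp(loss) is perplexity.
best_ppl = math.exp(best[1])
final_ppl = math.exp(final[1])

if __name__ == "__main__":
    print(f"best step: {best[0]}, val loss {best[1]}, perplexity {best_ppl:.2f}")
    print(f"final step: {final[0]}, val loss {final[1]}, perplexity {final_ppl:.2f}")
```

On these numbers the lowest validation loss (3.1877) occurs at step 28231, one evaluation before the final step whose loss (3.1956) is the figure reported at the top of this card.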

Framework versions

  • Transformers 4.46.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.20.0
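
To reproduce the environment, it can help to compare locally installed packages against the versions listed above. The following is a minimal sketch using only the standard library. The helper name `compare_versions` is hypothetical, and the PyPI distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to correspond to the frameworks listed.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card; keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.46.2",
    "torch": "2.5.1+cu124",
    "datasets": "3.2.0",
    "tokenizers": "0.20.0",
}


def compare_versions(expected):
    """Map each package to (wanted, installed or None, exact match?)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions(EXPECTED).items():
        status = "ok" if ok else "differs"
        print(f"{name}: wanted {wanted}, installed {installed} ({status})")
```

The comparison is an exact string match, so a CPU-only PyTorch build or a different CUDA suffix is reported as a difference even when the base version agrees.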

Model size: 110M params (Safetensors, tensor type F32)