nodative_cf_seed-63_1e-3
This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 3.1546
- Accuracy: 0.4039
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 32
- eval_batch_size: 64
- seed: 63
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 32000
- num_epochs: 20.0
- mixed_precision_training: Native AMP
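
The hyperparameters above imply two derived quantities worth checking: the effective batch size, and how far the learning rate actually climbs. The following is a minimal sketch, not the training code itself. It assumes a single device (the card does not state the device count), takes the total number of optimizer steps to be 29800 (the last step reported in the results table), and re-implements the usual linear warmup/decay rule in plain Python rather than calling any library. The names `linear_schedule_lr`, `NUM_DEVICES`, and `TOTAL_STEPS` are illustrative.

```python
# Values copied from the hyperparameter list.
PEAK_LR = 1e-3
WARMUP_STEPS = 32_000
PER_DEVICE_BATCH = 32
GRAD_ACCUM = 8

# Assumptions, not stated in the card:
NUM_DEVICES = 1        # single-device training
TOTAL_STEPS = 29_800   # last optimizer step listed in the results table


def linear_schedule_lr(step, peak_lr=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to peak_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    return peak_lr * max(0.0, (total - step) / max(1, total - warmup))


effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
final_lr = linear_schedule_lr(TOTAL_STEPS)

print(f"effective batch size: {effective_batch}")
print(f"learning rate at step {TOTAL_STEPS}: {final_lr:.8f}")
```

Under these assumptions the effective batch size matches the reported total_train_batch_size of 256. Because the warmup length (32000 steps) exceeds the number of steps taken, the schedule would still be in its warmup phase when training ends, so the learning rate would peak at roughly 93% of the configured 0.001 and the decay phase would never be reached.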
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
5.9916 | 0.9998 | 1490 | 4.3949 | 0.2949 |
4.3216 | 1.9997 | 2980 | 3.8825 | 0.3350 |
3.6895 | 2.9995 | 4470 | 3.6027 | 0.3586 |
3.5069 | 4.0 | 5961 | 3.4428 | 0.3740 |
3.2821 | 4.9998 | 7451 | 3.3414 | 0.3834 |
3.2066 | 5.9997 | 8941 | 3.2829 | 0.3888 |
3.0992 | 6.9995 | 10431 | 3.2434 | 0.3929 |
3.0599 | 8.0 | 11922 | 3.2210 | 0.3953 |
2.9981 | 8.9998 | 13412 | 3.1978 | 0.3975 |
2.9771 | 9.9997 | 14902 | 3.1883 | 0.3988 |
2.9355 | 10.9995 | 16392 | 3.1820 | 0.3998 |
2.9223 | 12.0 | 17883 | 3.1726 | 0.4008 |
2.8938 | 12.9998 | 19373 | 3.1647 | 0.4019 |
2.8825 | 13.9997 | 20863 | 3.1646 | 0.4019 |
2.8661 | 14.9995 | 22353 | 3.1594 | 0.4030 |
2.8555 | 16.0 | 23844 | 3.1606 | 0.4030 |
2.8487 | 16.9998 | 25334 | 3.1576 | 0.4033 |
2.8342 | 17.9997 | 26824 | 3.1580 | 0.4036 |
2.8345 | 18.9995 | 28314 | 3.1599 | 0.4034 |
2.8234 | 19.9966 | 29800 | 3.1546 | 0.4039 |
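
The table can be read more directly by converting validation loss to perplexity and locating the best checkpoint. The sketch below does this for the last six rows, copied by hand from the table. It assumes the reported loss is a mean token-level cross-entropy in nats, which the card does not state; if the loss is defined differently, the perplexity figure does not apply.

```python
import math

# (epoch, step, validation_loss, accuracy), copied from the last six table rows.
rows = [
    (14.9995, 22353, 3.1594, 0.4030),
    (16.0, 23844, 3.1606, 0.4030),
    (16.9998, 25334, 3.1576, 0.4033),
    (17.9997, 26824, 3.1580, 0.4036),
    (18.9995, 28314, 3.1599, 0.4034),
    (19.9966, 29800, 3.1546, 0.4039),
]

# Pick the row with the lowest validation loss.
best = min(rows, key=lambda r: r[2])

# Assumption: loss is mean cross-entropy in nats, so perplexity = exp(loss).
perplexity = math.exp(best[2])

print(f"best step: {best[1]}, validation loss: {best[2]}")
print(f"perplexity (under the stated assumption): {perplexity:.2f}")
```

Among these rows the lowest validation loss is the final one (step 29800), and the loss changes by less than 0.01 over the last five epochs, which suggests the run had largely plateaued.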
Framework versions
- Transformers 4.46.2
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.20.0