loose_balanced_cf_seed-63_1e-3
This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 3.1956
- Accuracy: 0.4006
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 32
- eval_batch_size: 64
- seed: 63
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 32000
- num_epochs: 20.0
- mixed_precision_training: Native AMP
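Two of the values above can be checked with simple arithmetic. The sketch below reproduces the total train batch size from the per-device batch size and the gradient accumulation steps, and estimates the learning rate reached at the last logged step (29700). It assumes a single device and assumes the warmup is linear from 0 to the peak learning rate, as a linear scheduler with warmup normally behaves; the function names are illustrative and are not part of the training code. Because the 32000 warmup steps exceed the 29700 logged steps, under these assumptions the run would have ended while still warming up.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-3
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 32_000
FINAL_STEP = 29_700  # last step in the training results table


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def warmup_lr(step: int, peak_lr: float, warmup_steps: int) -> float:
    """Learning rate during linear warmup (assumed shape).

    Only valid while step <= warmup_steps; the decay phase is not modelled.
    """
    if step > warmup_steps:
        raise ValueError("step is past the warmup phase")
    return peak_lr * step / max(1, warmup_steps)


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)
final_lr = warmup_lr(FINAL_STEP, LEARNING_RATE, WARMUP_STEPS)

print(f"total_train_batch_size: {total_batch}")
print(f"estimated learning rate at step {FINAL_STEP}: {final_lr:.6e}")
```

If the estimate holds, the configured peak of 0.001 was never reached and the decay phase of the linear schedule never started.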
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
5.9736 | 0.9994 | 1485 | 4.4156 | 0.2928 |
4.3066 | 1.9995 | 2971 | 3.9060 | 0.3332 |
3.6797 | 2.9996 | 4457 | 3.6355 | 0.3566 |
3.4925 | 3.9997 | 5943 | 3.4738 | 0.3709 |
3.2645 | 4.9997 | 7429 | 3.3761 | 0.3799 |
3.1915 | 5.9998 | 8915 | 3.3184 | 0.3856 |
3.0832 | 6.9999 | 10401 | 3.2787 | 0.3894 |
3.0425 | 8.0 | 11887 | 3.2506 | 0.3924 |
2.9847 | 8.9994 | 13372 | 3.2324 | 0.3942 |
2.9552 | 9.9995 | 14858 | 3.2234 | 0.3955 |
2.9204 | 10.9996 | 16344 | 3.2134 | 0.3966 |
2.8999 | 11.9997 | 17830 | 3.2045 | 0.3980 |
2.8789 | 12.9997 | 19316 | 3.2006 | 0.3985 |
2.8601 | 13.9998 | 20802 | 3.1995 | 0.3984 |
2.8494 | 14.9999 | 22288 | 3.1956 | 0.3990 |
2.8313 | 16.0 | 23774 | 3.1950 | 0.3997 |
2.8307 | 16.9994 | 25259 | 3.1953 | 0.3998 |
2.81 | 17.9995 | 26745 | 3.1945 | 0.4002 |
2.8189 | 18.9996 | 28231 | 3.1877 | 0.4007 |
2.7972 | 19.9882 | 29700 | 3.1956 | 0.4006 |
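The table can also be read programmatically. The sketch below, using values copied from the table, finds the checkpoint with the lowest validation loss and converts losses to perplexity. It assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state; if the loss is defined differently, the perplexity figures do not apply.

```python
import math

# (step, validation_loss) pairs copied from the training results table.
results = [
    (1485, 4.4156), (2971, 3.9060), (4457, 3.6355), (5943, 3.4738),
    (7429, 3.3761), (8915, 3.3184), (10401, 3.2787), (11887, 3.2506),
    (13372, 3.2324), (14858, 3.2234), (16344, 3.2134), (17830, 3.2045),
    (19316, 3.2006), (20802, 3.1995), (22288, 3.1956), (23774, 3.1950),
    (25259, 3.1953), (26745, 3.1945), (28231, 3.1877), (29700, 3.1956),
]

# Checkpoint with the lowest validation loss, and the final checkpoint.
best_step, best_loss = min(results, key=lambda row: row[1])
final_step, final_loss = results[-1]


def perplexity(loss: float) -> float:
    """exp(loss); assumes the loss is mean cross-entropy in nats."""
    return math.exp(loss)


print(f"best:  step {best_step}, loss {best_loss}, perplexity {perplexity(best_loss):.2f}")
print(f"final: step {final_step}, loss {final_loss}, perplexity {perplexity(final_loss):.2f}")
```

The lowest validation loss occurs one evaluation before the end of training, so the headline results reported at the top of the card correspond to the final checkpoint rather than the best one.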
Framework versions
- Transformers 4.46.2
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.20.0