# distilbert-classn-LinearAlg-finetuned-span-width-5
This model is a fine-tuned version of [dslim/distilbert-NER](https://huggingface.co/dslim/distilbert-NER) on an unspecified dataset. It achieves the following results on the evaluation set (a sketch of how metrics like these are typically computed follows the list):
- Loss: 1.4237
- Accuracy: 0.5476
- F1: 0.5578
- Precision: 0.5802
- Recall: 0.5476
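
Accuracy and Recall are identical here (and at every logged step below), which is consistent with weighted averaging over classes, where weighted recall reduces to plain accuracy. Below is a minimal sketch of such a `compute_metrics` function, assuming scikit-learn and weighted averaging; neither is confirmed by the card:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    # Weighted averaging is an assumption inferred from the matching
    # Accuracy/Recall columns, not something the card states.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,  # equals accuracy under weighted averaging
    }
```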
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a minimal `TrainingArguments` sketch reproducing them follows the list):
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 25
- mixed_precision_training: Native AMP
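
For reference, the list above maps onto Hugging Face `TrainingArguments` roughly as follows; `output_dir` and the surrounding `Trainer` setup are assumptions, not taken from the card:

```python
from transformers import TrainingArguments

# Hedged reconstruction of the hyperparameters listed above.
# output_dir is a placeholder; the card does not record one.
training_args = TrainingArguments(
    output_dir="distilbert-classn-LinearAlg-finetuned-span-width-5",
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 2 x 2 = 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=25,
    fp16=True,  # "Native AMP" mixed-precision training
)
```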
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
| 5.0508 | 0.6849 | 50 | 2.5065 | 0.0873 | 0.0655 | 0.0552 | 0.0873 |
| 5.0173 | 1.3699 | 100 | 2.4891 | 0.0873 | 0.0671 | 0.0598 | 0.0873 |
| 4.9755 | 2.0548 | 150 | 2.4689 | 0.0873 | 0.0594 | 0.0468 | 0.0873 |
| 4.9346 | 2.7397 | 200 | 2.4680 | 0.0873 | 0.0693 | 0.0656 | 0.0873 |
| 4.8569 | 3.4247 | 250 | 2.4446 | 0.0794 | 0.0691 | 0.0803 | 0.0794 |
| 4.7807 | 4.1096 | 300 | 2.4166 | 0.0873 | 0.0855 | 0.1281 | 0.0873 |
| 4.6991 | 4.7945 | 350 | 2.4031 | 0.0952 | 0.0940 | 0.1022 | 0.0952 |
| 4.4566 | 5.4795 | 400 | 2.3835 | 0.1508 | 0.1553 | 0.1795 | 0.1508 |
| 4.3468 | 6.1644 | 450 | 2.3874 | 0.1587 | 0.1524 | 0.1839 | 0.1587 |
| 4.2184 | 6.8493 | 500 | 2.3550 | 0.2063 | 0.2005 | 0.2522 | 0.2063 |
| 4.0597 | 7.5342 | 550 | 2.3082 | 0.2302 | 0.2093 | 0.2381 | 0.2302 |
| 3.8134 | 8.2192 | 600 | 2.2897 | 0.2302 | 0.2155 | 0.2873 | 0.2302 |
| 3.693 | 8.9041 | 650 | 2.2374 | 0.2540 | 0.2503 | 0.3564 | 0.2540 |
| 3.2232 | 9.5890 | 700 | 2.1660 | 0.2619 | 0.2491 | 0.3113 | 0.2619 |
| 3.0715 | 10.2740 | 750 | 2.0890 | 0.3175 | 0.3068 | 0.3467 | 0.3175 |
| 2.5457 | 10.9589 | 800 | 2.0022 | 0.3571 | 0.3422 | 0.3925 | 0.3571 |
| 2.2566 | 11.6438 | 850 | 1.9322 | 0.3413 | 0.3360 | 0.3502 | 0.3413 |
| 1.8691 | 12.3288 | 900 | 1.8635 | 0.3810 | 0.3572 | 0.3526 | 0.3810 |
| 1.6444 | 13.0137 | 950 | 1.7990 | 0.3889 | 0.3924 | 0.4215 | 0.3889 |
| 1.3832 | 13.6986 | 1000 | 1.7589 | 0.4524 | 0.4482 | 0.4583 | 0.4524 |
| 1.1667 | 14.3836 | 1050 | 1.7023 | 0.4365 | 0.4302 | 0.4431 | 0.4365 |
| 0.974 | 15.0685 | 1100 | 1.6077 | 0.4921 | 0.4881 | 0.4892 | 0.4921 |
| 0.8558 | 15.7534 | 1150 | 1.5825 | 0.4683 | 0.4645 | 0.4726 | 0.4683 |
| 0.7572 | 16.4384 | 1200 | 1.5705 | 0.4841 | 0.4796 | 0.4857 | 0.4841 |
| 0.6022 | 17.1233 | 1250 | 1.5357 | 0.5 | 0.4997 | 0.5102 | 0.5 |
| 0.5079 | 17.8082 | 1300 | 1.4927 | 0.5238 | 0.5295 | 0.5617 | 0.5238 |
| 0.4526 | 18.4932 | 1350 | 1.5055 | 0.4921 | 0.4949 | 0.5086 | 0.4921 |
| 0.4512 | 19.1781 | 1400 | 1.4643 | 0.5238 | 0.5270 | 0.5517 | 0.5238 |
| 0.3474 | 19.8630 | 1450 | 1.4326 | 0.5317 | 0.5402 | 0.5594 | 0.5317 |
| 0.2731 | 20.5479 | 1500 | 1.4435 | 0.5238 | 0.5310 | 0.5507 | 0.5238 |
| 0.2605 | 21.2329 | 1550 | 1.4289 | 0.5159 | 0.5249 | 0.5471 | 0.5159 |
| 0.235 | 21.9178 | 1600 | 1.4168 | 0.5556 | 0.5609 | 0.5747 | 0.5556 |
| 0.2219 | 22.6027 | 1650 | 1.4265 | 0.5317 | 0.5390 | 0.5591 | 0.5317 |
| 0.2373 | 23.2877 | 1700 | 1.4257 | 0.5476 | 0.5540 | 0.5734 | 0.5476 |
| 0.1764 | 23.9726 | 1750 | 1.4278 | 0.5397 | 0.5479 | 0.5698 | 0.5397 |
| 0.1967 | 24.6575 | 1800 | 1.4237 | 0.5476 | 0.5578 | 0.5802 | 0.5476 |
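
The card records no `pipeline_tag`, so the exact task head is not documented. The repository name and the per-example accuracy metrics suggest sequence classification; the load-and-predict sketch below assumes that head type and uses an illustrative input, neither of which is confirmed by the card:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: a sequence-classification head. If the model actually keeps the
# NER (token-classification) head of its base, use AutoModelForTokenClassification.
repo = "Heather-Driver/distilbert-classn-LinearAlg-finetuned-span-width-5"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo)

text = "The determinant of a triangular matrix is the product of its diagonal entries."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(dim=-1).item()])
```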
### Framework versions
- Transformers 4.48.3
- Pytorch 2.5.1+cu124
- Datasets 3.3.1
- Tokenizers 0.21.0