# distilbert-classn-LinearAlg-finetuned-span-width-1

This model is a fine-tuned version of [dslim/distilbert-NER](https://huggingface.co/dslim/distilbert-NER) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.7813
- Accuracy: 0.8016
- F1: 0.8033
- Precision: 0.8179
- Recall: 0.8016
## Model description

More information needed
## Intended uses & limitations

More information needed
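
In the absence of documented usage, here is a minimal loading sketch. It assumes the checkpoint exposes a sequence-classification head (suggested by "classn" in the name); if the token-classification head from dslim/distilbert-NER was kept instead, swap in `AutoModelForTokenClassification`. The input text is illustrative only.

```python
# Hedged sketch: assumes a sequence-classification head on this checkpoint.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Heather-Driver/distilbert-classn-LinearAlg-finetuned-span-width-1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "Find the eigenvalues of a 2x2 matrix."  # illustrative input, not from the card
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(dim=-1).item()
print(model.config.id2label.get(pred, pred))  # mapped label if the config defines one
```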
## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 25
- mixed_precision_training: Native AMP
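
As a reproducibility aid, the settings above map onto Transformers `TrainingArguments` roughly as follows. The `output_dir` and the evaluation cadence are assumptions (the cadence is inferred from the 50-step intervals in the results table below); the model and dataset wiring are not documented and are omitted.

```python
# Hedged sketch reconstructing the hyperparameter list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-classn-LinearAlg-finetuned-span-width-1",  # assumption
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 2 * 2 = 4
    optim="adamw_torch",             # AdamW defaults: betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=25,
    fp16=True,                       # Native AMP mixed precision
    eval_strategy="steps",           # assumption: matches the 50-step results table
    eval_steps=50,
    logging_steps=50,
)
```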
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--:|:---------:|:------:|
5.1371 | 0.6849 | 50 | 2.4649 | 0.0714 | 0.0501 | 0.1218 | 0.0714 |
4.9171 | 1.3699 | 100 | 2.4518 | 0.0794 | 0.0533 | 0.1254 | 0.0794 |
4.8245 | 2.0548 | 150 | 2.3856 | 0.1032 | 0.0914 | 0.2028 | 0.1032 |
4.8193 | 2.7397 | 200 | 2.3362 | 0.1429 | 0.1561 | 0.2496 | 0.1429 |
4.6972 | 3.4247 | 250 | 2.3636 | 0.1032 | 0.1092 | 0.1813 | 0.1032 |
4.615 | 4.1096 | 300 | 2.3017 | 0.2302 | 0.2227 | 0.2808 | 0.2302 |
4.3801 | 4.7945 | 350 | 2.2082 | 0.2937 | 0.2870 | 0.3542 | 0.2937 |
4.209 | 5.4795 | 400 | 2.1003 | 0.3333 | 0.3097 | 0.4304 | 0.3333 |
3.916 | 6.1644 | 450 | 1.9879 | 0.4206 | 0.3987 | 0.4402 | 0.4206 |
3.5137 | 6.8493 | 500 | 1.7583 | 0.5397 | 0.5249 | 0.5419 | 0.5397 |
2.99 | 7.5342 | 550 | 1.5959 | 0.5476 | 0.5150 | 0.5721 | 0.5476 |
2.4576 | 8.2192 | 600 | 1.3742 | 0.6508 | 0.6430 | 0.7217 | 0.6508 |
2.0467 | 8.9041 | 650 | 1.2277 | 0.6825 | 0.6805 | 0.7112 | 0.6825 |
1.6407 | 9.5890 | 700 | 1.0865 | 0.6825 | 0.6764 | 0.7042 | 0.6825 |
1.1023 | 10.2740 | 750 | 0.9734 | 0.7302 | 0.7269 | 0.7686 | 0.7302 |
0.8708 | 10.9589 | 800 | 0.8830 | 0.7619 | 0.7565 | 0.7740 | 0.7619 |
0.7335 | 11.6438 | 850 | 0.8266 | 0.7698 | 0.7707 | 0.7922 | 0.7698 |
0.5333 | 12.3288 | 900 | 0.8078 | 0.7619 | 0.7603 | 0.7723 | 0.7619 |
0.389 | 13.0137 | 950 | 0.7685 | 0.7857 | 0.7874 | 0.8046 | 0.7857 |
0.3018 | 13.6986 | 1000 | 0.7756 | 0.7778 | 0.7829 | 0.8064 | 0.7778 |
0.2219 | 14.3836 | 1050 | 0.7737 | 0.7698 | 0.7667 | 0.7743 | 0.7698 |
0.1865 | 15.0685 | 1100 | 0.7674 | 0.7857 | 0.7846 | 0.7994 | 0.7857 |
0.1429 | 15.7534 | 1150 | 0.7750 | 0.7778 | 0.7796 | 0.7981 | 0.7778 |
0.1038 | 16.4384 | 1200 | 0.7642 | 0.7937 | 0.7964 | 0.8099 | 0.7937 |
0.0881 | 17.1233 | 1250 | 0.7472 | 0.8016 | 0.8051 | 0.8245 | 0.8016 |
0.0946 | 17.8082 | 1300 | 0.7663 | 0.7937 | 0.7974 | 0.8162 | 0.7937 |
0.0501 | 18.4932 | 1350 | 0.7531 | 0.7937 | 0.7928 | 0.8020 | 0.7937 |
0.0421 | 19.1781 | 1400 | 0.7649 | 0.7937 | 0.7951 | 0.8081 | 0.7937 |
0.051 | 19.8630 | 1450 | 0.7695 | 0.8016 | 0.8035 | 0.8164 | 0.8016 |
0.0297 | 20.5479 | 1500 | 0.7799 | 0.8016 | 0.8036 | 0.8212 | 0.8016 |
0.0229 | 21.2329 | 1550 | 0.7649 | 0.8016 | 0.8035 | 0.8160 | 0.8016 |
0.0232 | 21.9178 | 1600 | 0.7810 | 0.7937 | 0.7948 | 0.8117 | 0.7937 |
0.0309 | 22.6027 | 1650 | 0.7813 | 0.8016 | 0.8033 | 0.8179 | 0.8016 |
0.034 | 23.2877 | 1700 | 0.7838 | 0.8016 | 0.8038 | 0.8189 | 0.8016 |
0.0147 | 23.9726 | 1750 | 0.7822 | 0.8016 | 0.8033 | 0.8179 | 0.8016 |
0.0232 | 24.6575 | 1800 | 0.7813 | 0.8016 | 0.8033 | 0.8179 | 0.8016 |
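
The card does not state how these metrics were computed, but Recall equaling Accuracy at every checkpoint is consistent with scikit-learn's weighted averaging, a common `compute_metrics` pattern in auto-generated cards. A hedged sketch, not confirmed by the source:

```python
# Hedged sketch: weighted-average metrics, consistent with (but not confirmed by)
# the Accuracy/F1/Precision/Recall columns in the table above.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```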
### Framework versions

- Transformers 4.48.3
- Pytorch 2.5.1+cu124
- Datasets 3.3.1
- Tokenizers 0.21.0
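
A quick way to check that a local environment matches these versions (a convenience sketch, not part of the original card):

```python
# Print installed versions to compare against the list above.
import transformers, torch, datasets, tokenizers

for name, mod in [("Transformers", transformers), ("PyTorch", torch),
                  ("Datasets", datasets), ("Tokenizers", tokenizers)]:
    print(f"{name}: {mod.__version__}")
# Expected: 4.48.3 / 2.5.1+cu124 / 3.3.1 / 0.21.0
```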
## Model tree

- Base model: [distilbert/distilbert-base-cased](https://huggingface.co/distilbert/distilbert-base-cased)
- Fine-tuned from: [dslim/distilbert-NER](https://huggingface.co/dslim/distilbert-NER)
- This model: Heather-Driver/distilbert-classn-LinearAlg-finetuned-span-width-1