bengali_qa_model_AGGRO_banglabert
This model is a fine-tuned version of csebuetnlp/banglabert on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.2676
- Exact Match: 98.5714
- F1 Score: 99.0056
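The card does not say how Exact Match and F1 Score were computed. The sketch below shows how these two metrics are commonly defined for extractive question answering, on the same 0-100 scale used above. It assumes SQuAD-style definitions with plain whitespace tokenization; the function names `exact_match` and `token_f1` are illustrative and are not taken from the training script.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """Return 100.0 if the whitespace-normalized strings are identical, else 0.0."""
    same = " ".join(prediction.split()) == " ".join(reference.split())
    return 100.0 * float(same)


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 (0-100) between a predicted and a reference answer span.

    Assumption: tokens are whitespace-separated; the actual evaluation
    may normalize or tokenize Bengali text differently.
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    # If either side is empty, score 100 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return 100.0 * float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 100.0 * 2 * precision * recall / (precision + recall)
```

Dataset-level scores are then the mean of the per-example values, which is why F1 can be well above zero while Exact Match is still 0.0 in the first training steps.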
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 3407
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- training_steps: 100
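Two of the values above follow from the others: the total train batch size is the per-device batch size times the gradient accumulation steps, and the learning rate at each step follows from the scheduler settings. The sketch below reproduces both in plain Python. It assumes a single training device and the usual linear-warmup-then-cosine-decay shape (half a cosine cycle, decaying to zero); the helper name `lr_at` is illustrative.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 16
WARMUP_RATIO = 0.1
TRAINING_STEPS = 100

# Assumption: one device, so the effective batch is batch size x accumulation.
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Number of optimizer steps spent in linear warmup.
warmup_steps = math.ceil(TRAINING_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps (assumed schedule shape)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, warmup_steps)
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, TRAINING_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size)
    print("warmup_steps:", warmup_steps)
    for step in (0, 5, 10, 55, 100):
        print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at 2e-05 once warmup ends and reaches its midpoint, 1e-05, halfway through the decay phase.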
Training results
Training Loss | Epoch | Step | Validation Loss | Exact Match | F1 Score |
---|---|---|---|---|---|
6.0126 | 0.0053 | 1 | 5.9783 | 0.0 | 0.6103 |
6.0125 | 0.0107 | 2 | 5.9540 | 0.0 | 0.7848 |
5.9675 | 0.0160 | 3 | 5.9074 | 0.0 | 0.9597 |
5.9287 | 0.0214 | 4 | 5.8425 | 0.0 | 1.7507 |
5.8586 | 0.0267 | 5 | 5.7636 | 0.1504 | 4.2535 |
5.8206 | 0.0321 | 6 | 5.6740 | 0.4511 | 11.2628 |
5.7246 | 0.0374 | 7 | 5.5749 | 1.7293 | 23.3816 |
5.634 | 0.0428 | 8 | 5.4574 | 3.9850 | 37.7873 |
5.4963 | 0.0481 | 9 | 5.3105 | 5.7895 | 47.4987 |
5.2985 | 0.0535 | 10 | 5.1265 | 7.5940 | 52.0471 |
5.182 | 0.0588 | 11 | 4.8997 | 11.9549 | 54.5555 |
4.973 | 0.0641 | 12 | 4.6631 | 15.9398 | 56.5530 |
4.8353 | 0.0695 | 13 | 4.4348 | 19.6241 | 58.6313 |
4.6269 | 0.0748 | 14 | 4.2322 | 23.5338 | 60.7029 |
4.4238 | 0.0802 | 15 | 4.0467 | 28.0451 | 62.7494 |
4.1976 | 0.0855 | 16 | 3.8781 | 32.7068 | 64.6375 |
4.1302 | 0.0909 | 17 | 3.7200 | 35.7895 | 66.2513 |
3.9139 | 0.0962 | 18 | 3.5621 | 39.5489 | 67.7758 |
3.8521 | 0.1016 | 19 | 3.4019 | 43.0075 | 69.2899 |
3.7003 | 0.1069 | 20 | 3.2534 | 46.0150 | 70.6373 |
3.5972 | 0.1123 | 21 | 3.1168 | 48.8722 | 72.3043 |
3.5249 | 0.1176 | 22 | 2.9875 | 51.5038 | 73.2903 |
3.1756 | 0.1230 | 23 | 2.8600 | 53.6090 | 74.1609 |
3.2323 | 0.1283 | 24 | 2.7356 | 55.2632 | 74.8864 |
3.0696 | 0.1336 | 25 | 2.6150 | 56.8421 | 75.8938 |
2.9806 | 0.1390 | 26 | 2.5029 | 58.9474 | 77.3831 |
2.8261 | 0.1443 | 27 | 2.3997 | 61.1278 | 78.8467 |
2.8965 | 0.1497 | 28 | 2.3045 | 63.9098 | 80.8890 |
2.6622 | 0.1550 | 29 | 2.2151 | 66.0902 | 82.4263 |
2.5132 | 0.1604 | 30 | 2.1300 | 68.0451 | 83.8984 |
2.5076 | 0.1657 | 31 | 2.0482 | 70.9774 | 85.4846 |
2.2189 | 0.1711 | 32 | 1.9678 | 72.7068 | 86.2628 |
2.0851 | 0.1764 | 33 | 1.8883 | 75.8647 | 87.8992 |
2.1198 | 0.1818 | 34 | 1.8091 | 78.5714 | 89.4148 |
2.0272 | 0.1871 | 35 | 1.7300 | 80.8271 | 90.3877 |
1.9951 | 0.1924 | 36 | 1.6514 | 82.7068 | 91.2138 |
1.7741 | 0.1978 | 37 | 1.5736 | 84.9624 | 91.8920 |
1.9176 | 0.2031 | 38 | 1.4970 | 86.5414 | 92.3250 |
1.8599 | 0.2085 | 39 | 1.4219 | 87.5940 | 92.7578 |
1.8095 | 0.2138 | 40 | 1.3496 | 88.5714 | 93.0980 |
1.7814 | 0.2192 | 41 | 1.2790 | 90.0752 | 93.7737 |
1.4602 | 0.2245 | 42 | 1.2103 | 91.5038 | 94.6447 |
1.5147 | 0.2299 | 43 | 1.1431 | 92.1805 | 95.1039 |
1.4205 | 0.2352 | 44 | 1.0774 | 92.9323 | 95.4111 |
1.3222 | 0.2406 | 45 | 1.0127 | 93.9850 | 96.0199 |
1.2477 | 0.2459 | 46 | 0.9508 | 94.8120 | 96.5219 |
1.1406 | 0.2513 | 47 | 0.8936 | 95.2632 | 96.8391 |
1.1698 | 0.2566 | 48 | 0.8382 | 96.3158 | 97.5331 |
1.1359 | 0.2619 | 49 | 0.7847 | 97.0677 | 97.9841 |
1.1811 | 0.2673 | 50 | 0.7324 | 97.5940 | 98.4006 |
0.9734 | 0.2726 | 51 | 0.6814 | 97.7444 | 98.5321 |
0.928 | 0.2780 | 52 | 0.6318 | 97.8947 | 98.6140 |
0.8989 | 0.2833 | 53 | 0.5859 | 98.1203 | 98.7571 |
0.7784 | 0.2887 | 54 | 0.5430 | 98.3459 | 98.9243 |
1.0015 | 0.2940 | 55 | 0.5027 | 98.3459 | 98.8914 |
0.7509 | 0.2994 | 56 | 0.4656 | 98.5714 | 99.0811 |
0.6838 | 0.3047 | 57 | 0.4328 | 98.7970 | 99.1723 |
0.7336 | 0.3101 | 58 | 0.4042 | 98.8722 | 99.1327 |
0.5729 | 0.3154 | 59 | 0.3781 | 98.9474 | 99.2079 |
0.5891 | 0.3207 | 60 | 0.3538 | 99.0226 | 99.3362 |
0.6168 | 0.3261 | 61 | 0.3322 | 99.1729 | 99.4169 |
0.5503 | 0.3314 | 62 | 0.3130 | 99.1729 | 99.4169 |
0.5058 | 0.3368 | 63 | 0.2955 | 99.1729 | 99.4169 |
0.4065 | 0.3421 | 64 | 0.2788 | 99.3233 | 99.5000 |
0.4466 | 0.3475 | 65 | 0.2638 | 99.2481 | 99.4981 |
0.4727 | 0.3528 | 66 | 0.2496 | 99.2481 | 99.4981 |
0.45 | 0.3582 | 67 | 0.2365 | 99.2481 | 99.4981 |
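The Epoch column can be used for a rough sanity check on the amount of training data, which the card does not state. The sketch below derives an approximate number of optimizer steps per epoch from the last logged row and multiplies it by the total train batch size. This is only an estimate: the epoch values are rounded to four decimals, and the calculation assumes every optimizer step consumes exactly 64 examples.

```python
# Last logged row of the results table and the total train batch size.
last_step = 67
last_epoch = 0.3582
total_train_batch_size = 64

# Fraction of an epoch covered by one optimizer step.
epoch_per_step = last_epoch / last_step

# Approximate optimizer steps needed for one full pass over the data.
steps_per_epoch = 1.0 / epoch_per_step

# Rough training-set size implied by the numbers above (an estimate only).
approx_train_examples = steps_per_epoch * total_train_batch_size

if __name__ == "__main__":
    print(f"steps per epoch (approx.): {steps_per_epoch:.1f}")
    print(f"training examples (approx.): {approx_train_examples:.0f}")
```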
Framework versions
- Transformers 4.46.3
- Pytorch 2.4.0
- Datasets 3.1.0
- Tokenizers 0.20.3
Model tree for Mediocre-Judge/bengali_qa_model_AGGRO_banglabert
Base model
csebuetnlp/banglabert