# ViLegalBERT-v0

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.4575
- Accuracy: 0.8898

## Model description

More information needed

## Intended uses & limitations

More information needed
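
The card does not yet document usage. Since the reported evaluation metric is masked-language-model accuracy, a fill-mask call is the most plausible entry point; the snippet below is a hedged sketch, not documented usage. The repository id is a placeholder, and it assumes a BERT-style tokenizer whose mask token is `[MASK]`.

```python
# Hedged sketch: assumes this checkpoint is a BERT-style masked language model
# published on the Hugging Face Hub. The repo id below is a placeholder.
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="your-namespace/ViLegalBERT-v0",  # hypothetical repo id
)

# Vietnamese legal sentence: "This contract takes effect from the date of [MASK]."
# Assumes the tokenizer's mask token is [MASK].
for prediction in fill_mask("Hợp đồng này có hiệu lực kể từ ngày [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 4))
```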

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a minimal reproduction sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: AdamW (torch fused) with betas=(0.9, 0.98) and epsilon=1e-06; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5.0
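
These values map directly onto 🤗 `TrainingArguments` (the total train batch size of 256 is the per-device batch of 32 times 8 gradient-accumulation steps, assuming a single device). A minimal sketch of that mapping is shown below; it is not the authors' actual training script, and `output_dir` is a placeholder:

```python
# Minimal sketch mapping the listed hyperparameters onto Hugging Face
# TrainingArguments. Not the authors' script; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="vilegalbert-v0",      # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=8,    # 32 * 8 = 256 total train batch size
    optim="adamw_torch_fused",        # OptimizerNames.ADAMW_TORCH_FUSED
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
)
```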

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
0.7791 | 0.0483 | 1000 | 0.7469 | 0.8340 |
0.7006 | 0.0965 | 2000 | 0.7024 | 0.8423 |
0.6624 | 0.1448 | 3000 | 0.6724 | 0.8477 |
0.6255 | 0.1930 | 4000 | 0.6500 | 0.8523 |
0.6027 | 0.2413 | 5000 | 0.6289 | 0.8560 |
0.5836 | 0.2895 | 6000 | 0.6169 | 0.8583 |
0.5743 | 0.3378 | 7000 | 0.6033 | 0.8609 |
0.5613 | 0.3860 | 8000 | 0.5947 | 0.8626 |
0.5486 | 0.4343 | 9000 | 0.5874 | 0.8642 |
0.5447 | 0.4825 | 10000 | 0.5778 | 0.8657 |
0.5389 | 0.5308 | 11000 | 0.5710 | 0.8668 |
0.5295 | 0.5791 | 12000 | 0.5650 | 0.8688 |
0.5188 | 0.6273 | 13000 | 0.5551 | 0.8704 |
0.5102 | 0.6756 | 14000 | 0.5491 | 0.8715 |
0.5096 | 0.7238 | 15000 | 0.5462 | 0.8723 |
0.5052 | 0.7721 | 16000 | 0.5386 | 0.8736 |
0.4981 | 0.8203 | 17000 | 0.5339 | 0.8747 |
0.491 | 0.8686 | 18000 | 0.5272 | 0.8757 |
0.4894 | 0.9168 | 19000 | 0.5243 | 0.8764 |
0.4853 | 0.9651 | 20000 | 0.5232 | 0.8768 |
0.4812 | 1.0133 | 21000 | 0.5152 | 0.8779 |
0.4732 | 1.0616 | 22000 | 0.5143 | 0.8789 |
0.474 | 1.1098 | 23000 | 0.5101 | 0.8791 |
0.4701 | 1.1581 | 24000 | 0.5060 | 0.8803 |
0.4678 | 1.2063 | 25000 | 0.5025 | 0.8806 |
0.4661 | 1.2546 | 26000 | 0.5003 | 0.8811 |
0.464 | 1.3028 | 27000 | 0.4949 | 0.8822 |
0.461 | 1.3511 | 28000 | 0.4929 | 0.8825 |
0.4574 | 1.3994 | 29000 | 0.4916 | 0.8829 |
0.4598 | 1.4476 | 30000 | 0.4896 | 0.8834 |
0.4549 | 1.4959 | 31000 | 0.4878 | 0.8839 |
0.4523 | 1.5441 | 32000 | 0.4849 | 0.8846 |
0.4482 | 1.5924 | 33000 | 0.4820 | 0.8850 |
0.4477 | 1.6406 | 34000 | 0.4802 | 0.8854 |
0.4467 | 1.6889 | 35000 | 0.4789 | 0.8853 |
0.4434 | 1.7371 | 36000 | 0.4765 | 0.8862 |
0.443 | 1.7854 | 37000 | 0.4752 | 0.8865 |
0.4417 | 1.8336 | 38000 | 0.4741 | 0.8865 |
0.4393 | 1.8819 | 39000 | 0.4713 | 0.8870 |
0.4362 | 1.9302 | 40000 | 0.4708 | 0.8874 |
0.4356 | 1.9784 | 41000 | 0.4687 | 0.8877 |
0.4343 | 2.0266 | 42000 | 0.4677 | 0.8880 |
0.4333 | 2.0749 | 43000 | 0.4638 | 0.8888 |
0.4319 | 2.1231 | 44000 | 0.4645 | 0.8889 |
0.4363 | 2.1714 | 45000 | 0.4633 | 0.8886 |
0.4281 | 2.2197 | 46000 | 0.4602 | 0.8892 |
0.4242 | 2.2679 | 47000 | 0.4609 | 0.8895 |
0.4262 | 2.3162 | 48000 | 0.4576 | 0.8898 |
0.4231 | 2.3644 | 49000 | 0.4554 | 0.8904 |
0.4197 | 2.4127 | 50000 | 0.4562 | 0.8903 |
0.4231 | 2.4609 | 51000 | 0.4556 | 0.8902 |
0.422 | 2.5092 | 52000 | 0.4522 | 0.8909 |
0.4222 | 2.5574 | 53000 | 0.4526 | 0.8906 |
0.4208 | 2.6057 | 54000 | 0.4497 | 0.8916 |
0.42 | 2.6539 | 55000 | 0.4510 | 0.8914 |
0.4218 | 2.7022 | 56000 | 0.4492 | 0.8920 |
0.4162 | 2.7505 | 57000 | 0.4479 | 0.8922 |
0.4168 | 2.7987 | 58000 | 0.4466 | 0.8922 |
0.418 | 2.8470 | 59000 | 0.4466 | 0.8921 |
0.4164 | 2.8952 | 60000 | 0.4447 | 0.8928 |
0.4133 | 2.9435 | 61000 | 0.4437 | 0.8929 |
0.4103 | 2.9917 | 62000 | 0.4418 | 0.8932 |
0.4106 | 3.0400 | 63000 | 0.4397 | 0.8939 |
0.4122 | 3.0882 | 64000 | 0.4392 | 0.8938 |
0.4082 | 3.1365 | 65000 | 0.4380 | 0.8942 |
0.4069 | 3.1847 | 66000 | 0.4379 | 0.8942 |
0.4076 | 3.2330 | 67000 | 0.4369 | 0.8944 |
0.4079 | 3.2812 | 68000 | 0.4355 | 0.8946 |
0.4045 | 3.3295 | 69000 | 0.4351 | 0.8946 |
0.4032 | 3.3777 | 70000 | 0.4350 | 0.8950 |
0.4043 | 3.4260 | 71000 | 0.4329 | 0.8952 |
0.4018 | 3.4742 | 72000 | 0.4319 | 0.8952 |
0.4028 | 3.5225 | 73000 | 0.4324 | 0.8954 |
0.4017 | 3.5708 | 74000 | 0.4303 | 0.8956 |
0.4013 | 3.6190 | 75000 | 0.4312 | 0.8957 |
0.4003 | 3.6673 | 76000 | 0.4290 | 0.8959 |
0.397 | 3.7155 | 77000 | 0.4289 | 0.8958 |
0.3978 | 3.7638 | 78000 | 0.4280 | 0.8962 |
0.3985 | 3.8120 | 79000 | 0.4266 | 0.8967 |
0.3972 | 3.8603 | 80000 | 0.4257 | 0.8966 |
0.3926 | 3.9085 | 81000 | 0.4241 | 0.8969 |
0.397 | 3.9568 | 82000 | 0.4244 | 0.8968 |
0.3963 | 4.0050 | 83000 | 0.4239 | 0.8971 |
0.3952 | 4.0533 | 84000 | 0.4234 | 0.8970 |
0.3916 | 4.1015 | 85000 | 0.4217 | 0.8973 |
0.3938 | 4.1498 | 86000 | 0.4190 | 0.8979 |
0.3926 | 4.1980 | 87000 | 0.4188 | 0.8981 |
0.3924 | 4.2463 | 88000 | 0.4198 | 0.8978 |
0.3895 | 4.2945 | 89000 | 0.4195 | 0.8978 |
0.3918 | 4.3428 | 90000 | 0.4186 | 0.8982 |
0.3906 | 4.3911 | 91000 | 0.4182 | 0.8983 |
0.3927 | 4.4393 | 92000 | 0.4178 | 0.8983 |
0.3902 | 4.4876 | 93000 | 0.4170 | 0.8986 |
0.3924 | 4.5358 | 94000 | 0.4179 | 0.8982 |
0.3879 | 4.5841 | 95000 | 0.4144 | 0.8989 |
0.3897 | 4.6323 | 96000 | 0.4150 | 0.8989 |
0.3903 | 4.6806 | 97000 | 0.4158 | 0.8989 |
0.3893 | 4.7288 | 98000 | 0.4190 | 0.8983 |
0.5424 | 4.7771 | 99000 | 0.6255 | 0.8573 |
0.4768 | 4.8253 | 100000 | 0.4770 | 0.8850 |
0.4418 | 4.8736 | 101000 | 0.4506 | 0.8913 |
0.4562 | 4.9219 | 102000 | 0.4713 | 0.8871 |
0.4494 | 4.9701 | 103000 | 0.4575 | 0.8898 |

### Framework versions

- Transformers 4.56.0.dev0
- Pytorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4