ViLegalBERT-v0

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4575
  • Accuracy: 0.8898

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: fused AdamW (adamw_torch_fused) with betas=(0.9, 0.98) and epsilon=1e-06; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
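As a sanity check on the values above, here is a minimal sketch (plain Python, no library dependencies; the helper names are illustrative, not part of the training code) of how the effective batch size and the linear schedule with warmup combine:

```python
def effective_batch_size(per_device_batch: int, accum_steps: int, n_devices: int = 1) -> int:
    """Examples contributing to one optimizer update:
    train_batch_size * gradient_accumulation_steps * number of devices."""
    return per_device_batch * accum_steps * n_devices

def linear_warmup_lr(step: int, total_steps: int, base_lr: float, warmup_ratio: float) -> float:
    """lr_scheduler_type 'linear' with warmup_ratio: ramp linearly up to
    base_lr over the first warmup_ratio * total_steps updates, then decay
    linearly back to 0 by the end of training."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

# 32 per-device examples, accumulated over 8 steps, on one device:
print(effective_batch_size(32, 8))  # 256, matching total_train_batch_size
```

With these hyperparameters the learning rate peaks at 1e-4 after the first 10% of updates, then decays to zero over the remaining 90%.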

Training results

| Training Loss | Epoch  | Step   | Validation Loss | Accuracy |
|:-------------:|:------:|:------:|:---------------:|:--------:|
| 0.7791        | 0.0483 | 1000   | 0.7469          | 0.8340   |
| 0.7006        | 0.0965 | 2000   | 0.7024          | 0.8423   |
| 0.6624        | 0.1448 | 3000   | 0.6724          | 0.8477   |
| 0.6255        | 0.1930 | 4000   | 0.6500          | 0.8523   |
| 0.6027        | 0.2413 | 5000   | 0.6289          | 0.8560   |
| 0.5836        | 0.2895 | 6000   | 0.6169          | 0.8583   |
| 0.5743        | 0.3378 | 7000   | 0.6033          | 0.8609   |
| 0.5613        | 0.3860 | 8000   | 0.5947          | 0.8626   |
| 0.5486        | 0.4343 | 9000   | 0.5874          | 0.8642   |
| 0.5447        | 0.4825 | 10000  | 0.5778          | 0.8657   |
| 0.5389        | 0.5308 | 11000  | 0.5710          | 0.8668   |
| 0.5295        | 0.5791 | 12000  | 0.5650          | 0.8688   |
| 0.5188        | 0.6273 | 13000  | 0.5551          | 0.8704   |
| 0.5102        | 0.6756 | 14000  | 0.5491          | 0.8715   |
| 0.5096        | 0.7238 | 15000  | 0.5462          | 0.8723   |
| 0.5052        | 0.7721 | 16000  | 0.5386          | 0.8736   |
| 0.4981        | 0.8203 | 17000  | 0.5339          | 0.8747   |
| 0.491         | 0.8686 | 18000  | 0.5272          | 0.8757   |
| 0.4894        | 0.9168 | 19000  | 0.5243          | 0.8764   |
| 0.4853        | 0.9651 | 20000  | 0.5232          | 0.8768   |
| 0.4812        | 1.0133 | 21000  | 0.5152          | 0.8779   |
| 0.4732        | 1.0616 | 22000  | 0.5143          | 0.8789   |
| 0.474         | 1.1098 | 23000  | 0.5101          | 0.8791   |
| 0.4701        | 1.1581 | 24000  | 0.5060          | 0.8803   |
| 0.4678        | 1.2063 | 25000  | 0.5025          | 0.8806   |
| 0.4661        | 1.2546 | 26000  | 0.5003          | 0.8811   |
| 0.464         | 1.3028 | 27000  | 0.4949          | 0.8822   |
| 0.461         | 1.3511 | 28000  | 0.4929          | 0.8825   |
| 0.4574        | 1.3994 | 29000  | 0.4916          | 0.8829   |
| 0.4598        | 1.4476 | 30000  | 0.4896          | 0.8834   |
| 0.4549        | 1.4959 | 31000  | 0.4878          | 0.8839   |
| 0.4523        | 1.5441 | 32000  | 0.4849          | 0.8846   |
| 0.4482        | 1.5924 | 33000  | 0.4820          | 0.8850   |
| 0.4477        | 1.6406 | 34000  | 0.4802          | 0.8854   |
| 0.4467        | 1.6889 | 35000  | 0.4789          | 0.8853   |
| 0.4434        | 1.7371 | 36000  | 0.4765          | 0.8862   |
| 0.443         | 1.7854 | 37000  | 0.4752          | 0.8865   |
| 0.4417        | 1.8336 | 38000  | 0.4741          | 0.8865   |
| 0.4393        | 1.8819 | 39000  | 0.4713          | 0.8870   |
| 0.4362        | 1.9302 | 40000  | 0.4708          | 0.8874   |
| 0.4356        | 1.9784 | 41000  | 0.4687          | 0.8877   |
| 0.4343        | 2.0266 | 42000  | 0.4677          | 0.8880   |
| 0.4333        | 2.0749 | 43000  | 0.4638          | 0.8888   |
| 0.4319        | 2.1231 | 44000  | 0.4645          | 0.8889   |
| 0.4363        | 2.1714 | 45000  | 0.4633          | 0.8886   |
| 0.4281        | 2.2197 | 46000  | 0.4602          | 0.8892   |
| 0.4242        | 2.2679 | 47000  | 0.4609          | 0.8895   |
| 0.4262        | 2.3162 | 48000  | 0.4576          | 0.8898   |
| 0.4231        | 2.3644 | 49000  | 0.4554          | 0.8904   |
| 0.4197        | 2.4127 | 50000  | 0.4562          | 0.8903   |
| 0.4231        | 2.4609 | 51000  | 0.4556          | 0.8902   |
| 0.422         | 2.5092 | 52000  | 0.4522          | 0.8909   |
| 0.4222        | 2.5574 | 53000  | 0.4526          | 0.8906   |
| 0.4208        | 2.6057 | 54000  | 0.4497          | 0.8916   |
| 0.42          | 2.6539 | 55000  | 0.4510          | 0.8914   |
| 0.4218        | 2.7022 | 56000  | 0.4492          | 0.8920   |
| 0.4162        | 2.7505 | 57000  | 0.4479          | 0.8922   |
| 0.4168        | 2.7987 | 58000  | 0.4466          | 0.8922   |
| 0.418         | 2.8470 | 59000  | 0.4466          | 0.8921   |
| 0.4164        | 2.8952 | 60000  | 0.4447          | 0.8928   |
| 0.4133        | 2.9435 | 61000  | 0.4437          | 0.8929   |
| 0.4103        | 2.9917 | 62000  | 0.4418          | 0.8932   |
| 0.4106        | 3.0400 | 63000  | 0.4397          | 0.8939   |
| 0.4122        | 3.0882 | 64000  | 0.4392          | 0.8938   |
| 0.4082        | 3.1365 | 65000  | 0.4380          | 0.8942   |
| 0.4069        | 3.1847 | 66000  | 0.4379          | 0.8942   |
| 0.4076        | 3.2330 | 67000  | 0.4369          | 0.8944   |
| 0.4079        | 3.2812 | 68000  | 0.4355          | 0.8946   |
| 0.4045        | 3.3295 | 69000  | 0.4351          | 0.8946   |
| 0.4032        | 3.3777 | 70000  | 0.4350          | 0.8950   |
| 0.4043        | 3.4260 | 71000  | 0.4329          | 0.8952   |
| 0.4018        | 3.4742 | 72000  | 0.4319          | 0.8952   |
| 0.4028        | 3.5225 | 73000  | 0.4324          | 0.8954   |
| 0.4017        | 3.5708 | 74000  | 0.4303          | 0.8956   |
| 0.4013        | 3.6190 | 75000  | 0.4312          | 0.8957   |
| 0.4003        | 3.6673 | 76000  | 0.4290          | 0.8959   |
| 0.397         | 3.7155 | 77000  | 0.4289          | 0.8958   |
| 0.3978        | 3.7638 | 78000  | 0.4280          | 0.8962   |
| 0.3985        | 3.8120 | 79000  | 0.4266          | 0.8967   |
| 0.3972        | 3.8603 | 80000  | 0.4257          | 0.8966   |
| 0.3926        | 3.9085 | 81000  | 0.4241          | 0.8969   |
| 0.397         | 3.9568 | 82000  | 0.4244          | 0.8968   |
| 0.3963        | 4.0050 | 83000  | 0.4239          | 0.8971   |
| 0.3952        | 4.0533 | 84000  | 0.4234          | 0.8970   |
| 0.3916        | 4.1015 | 85000  | 0.4217          | 0.8973   |
| 0.3938        | 4.1498 | 86000  | 0.4190          | 0.8979   |
| 0.3926        | 4.1980 | 87000  | 0.4188          | 0.8981   |
| 0.3924        | 4.2463 | 88000  | 0.4198          | 0.8978   |
| 0.3895        | 4.2945 | 89000  | 0.4195          | 0.8978   |
| 0.3918        | 4.3428 | 90000  | 0.4186          | 0.8982   |
| 0.3906        | 4.3911 | 91000  | 0.4182          | 0.8983   |
| 0.3927        | 4.4393 | 92000  | 0.4178          | 0.8983   |
| 0.3902        | 4.4876 | 93000  | 0.4170          | 0.8986   |
| 0.3924        | 4.5358 | 94000  | 0.4179          | 0.8982   |
| 0.3879        | 4.5841 | 95000  | 0.4144          | 0.8989   |
| 0.3897        | 4.6323 | 96000  | 0.4150          | 0.8989   |
| 0.3903        | 4.6806 | 97000  | 0.4158          | 0.8989   |
| 0.3893        | 4.7288 | 98000  | 0.4190          | 0.8983   |
| 0.5424        | 4.7771 | 99000  | 0.6255          | 0.8573   |
| 0.4768        | 4.8253 | 100000 | 0.4770          | 0.8850   |
| 0.4418        | 4.8736 | 101000 | 0.4506          | 0.8913   |
| 0.4562        | 4.9219 | 102000 | 0.4713          | 0.8871   |
| 0.4494        | 4.9701 | 103000 | 0.4575          | 0.8898   |
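The log shows validation loss improving steadily until a transient spike around step 99000, after which it recovers; the final evaluation (loss 0.4575, accuracy 0.8898) matches the summary at the top. A minimal sketch of picking the best checkpoint from such a log, using a few rows transcribed from the table above (the selection helper is illustrative, not part of the training code):

```python
# Each entry is (step, validation_loss, accuracy), copied from the table above.
rows = [
    (95000, 0.4144, 0.8989),
    (96000, 0.4150, 0.8989),
    (99000, 0.6255, 0.8573),   # transient spike late in epoch 4
    (103000, 0.4575, 0.8898),  # final evaluation
]

# Select the checkpoint with the lowest validation loss.
best = min(rows, key=lambda r: r[1])
print(best)  # (95000, 0.4144, 0.8989)
```

By this criterion, the step-95000 checkpoint (loss 0.4144, accuracy 0.8989) is slightly better than the final one reported in the summary.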

Framework versions

  • Transformers 4.56.0.dev0
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Model size

  • 278M parameters (F32, Safetensors)