ModernBERT-base_massive_modern_base_crf_v1

This model is a fine-tuned version of answerdotai/ModernBERT-base on an unknown dataset (the model name suggests the MASSIVE intent-classification/slot-filling dataset, but this is not confirmed by the card). It achieves the following results on the evaluation set:

  • Loss: 19.7145
  • Slot P: 0.5406
  • Slot R: 0.6025
  • Slot F1: 0.5699
  • Slot Exact Match: 0.5848
  • Intent Acc: 0.7496
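Slot P/R/F1 above are span-level metrics over BIO-tagged slot sequences (seqeval-style): a predicted slot counts as correct only if its label and exact boundaries both match. A minimal pure-Python sketch of this scoring on a toy example (an illustration of the metric, not this model's evaluation code):

```python
def extract_spans(tags):
    """Collect (label, start, end) slot spans from a BIO tag sequence."""
    spans, start, label = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # "O" sentinel flushes the last span
        if tag == "O" or tag.startswith("B-") or (tag.startswith("I-") and tag[2:] != label):
            if label is not None:
                spans.add((label, start, i))
            start, label = (i, tag[2:]) if tag != "O" else (None, None)
    return spans

def slot_prf(gold_tags, pred_tags):
    """Span-level precision, recall, and F1 (exact label + boundary match)."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    tp = len(gold & pred)
    prec = tp / len(pred) if pred else 0.0
    rec = tp / len(gold) if gold else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

gold = ["O", "B-date", "I-date", "O", "B-place"]
pred = ["O", "B-date", "I-date", "O", "O"]
print(slot_prf(gold, pred))  # precision 1.0, recall 0.5
```

"Slot Exact Match" is the stricter sentence-level variant: the fraction of utterances whose entire predicted slot sequence matches the gold sequence.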

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 256
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 30
  • mixed_precision_training: Native AMP
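The derived quantities above fit together as follows (a sketch; the 45 optimizer steps per epoch are inferred from the results table, where epoch 1.0 ends at step 45, and the schedule shape assumes the usual linear-warmup-then-cosine-decay "cosine" scheduler):

```python
import math

train_batch_size = 128
gradient_accumulation_steps = 2
steps_per_epoch = 45          # inferred from the results table (epoch 1.0 = step 45)
num_epochs = 30
warmup_ratio = 0.06
base_lr = 5e-5

total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 256
total_steps = steps_per_epoch * num_epochs                               # 1350
warmup_steps = int(warmup_ratio * total_steps)                           # 81

def lr_at(step):
    """Linear warmup to base_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1 + math.cos(math.pi * progress))

print(total_train_batch_size, total_steps, warmup_steps)  # 256 1350 81
```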

Training results

| Training Loss | Epoch | Step | Validation Loss | Slot P | Slot R | Slot F1 | Slot Exact Match | Intent Acc |
|--------------:|------:|-----:|----------------:|-------:|-------:|--------:|-----------------:|-----------:|
| No log        | 1.0   | 45   | 43.1123         | 0.0    | 0.0    | 0.0     | 0.3187           | 0.0576     |
| 179.6545      | 2.0   | 90   | 30.0614         | 0.1686 | 0.0716 | 0.1006  | 0.3074           | 0.1672     |
| 72.1388       | 3.0   | 135  | 21.0393         | 0.2894 | 0.2493 | 0.2678  | 0.3369           | 0.3468     |
| 44.9919       | 4.0   | 180  | 16.7777         | 0.3668 | 0.3617 | 0.3642  | 0.4004           | 0.4801     |
| 29.7573       | 5.0   | 225  | 14.9362         | 0.4282 | 0.4408 | 0.4344  | 0.4584           | 0.5430     |
| 20.8656       | 6.0   | 270  | 13.8422         | 0.4479 | 0.5    | 0.4725  | 0.5007           | 0.6173     |
| 13.791        | 7.0   | 315  | 14.2036         | 0.4469 | 0.4881 | 0.4666  | 0.5042           | 0.6557     |
| 9.486         | 8.0   | 360  | 14.5381         | 0.4677 | 0.5159 | 0.4907  | 0.5219           | 0.6763     |
| 6.4651        | 9.0   | 405  | 14.5455         | 0.4717 | 0.5552 | 0.5101  | 0.5421           | 0.6926     |
| 4.6616        | 10.0  | 450  | 15.0882         | 0.5020 | 0.5672 | 0.5326  | 0.5637           | 0.7177     |
| 4.6616        | 11.0  | 495  | 14.6622         | 0.4818 | 0.5861 | 0.5288  | 0.5489           | 0.7231     |
| 3.4254        | 12.0  | 540  | 15.3736         | 0.4835 | 0.6050 | 0.5375  | 0.5622           | 0.7241     |
| 2.4577        | 13.0  | 585  | 15.5837         | 0.4923 | 0.6030 | 0.5420  | 0.5652           | 0.7334     |
| 1.977         | 14.0  | 630  | 16.2094         | 0.5082 | 0.5856 | 0.5442  | 0.5667           | 0.7368     |
| 1.4042        | 15.0  | 675  | 16.3909         | 0.4922 | 0.5960 | 0.5392  | 0.5588           | 0.7314     |
| 1.0461        | 16.0  | 720  | 16.8022         | 0.5050 | 0.6005 | 0.5486  | 0.5730           | 0.7359     |
| 0.7993        | 17.0  | 765  | 16.9952         | 0.5128 | 0.6080 | 0.5563  | 0.5735           | 0.7398     |
| 0.5947        | 18.0  | 810  | 17.6901         | 0.5205 | 0.5990 | 0.5570  | 0.5681           | 0.7447     |
| 0.457         | 19.0  | 855  | 18.4398         | 0.5231 | 0.5856 | 0.5526  | 0.5691           | 0.7462     |
| 0.3941        | 20.0  | 900  | 18.8391         | 0.5340 | 0.6010 | 0.5655  | 0.5834           | 0.7427     |
| 0.3941        | 21.0  | 945  | 19.8849         | 0.5409 | 0.5886 | 0.5637  | 0.5848           | 0.7486     |
| 0.2892        | 22.0  | 990  | 18.9729         | 0.5356 | 0.6134 | 0.5719  | 0.5794           | 0.7550     |
| 0.2669        | 23.0  | 1035 | 19.5638         | 0.5404 | 0.5955 | 0.5666  | 0.5829           | 0.7531     |
| 0.2357        | 24.0  | 1080 | 19.5290         | 0.5362 | 0.6045 | 0.5683  | 0.5824           | 0.7501     |
| 0.2012        | 25.0  | 1125 | 19.7145         | 0.5406 | 0.6025 | 0.5699  | 0.5848           | 0.7496     |

Framework versions

  • Transformers 4.55.0
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.4