ModernBERT-large_massive_modernbert_large_crf_v1

This model is a fine-tuned version of answerdotai/ModernBERT-large on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 15.5718
  • Slot P (precision): 0.5398
  • Slot R (recall): 0.6408
  • Slot F1: 0.5860
  • Slot Exact Match: 0.6001
  • Intent Acc: 0.7831
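
The reported Slot F1 is consistent with the harmonic mean of the reported slot precision and recall. The following is a minimal check, assuming the standard F1 definition; the card does not say how the metrics were computed, and the helper name `f1_from_pr` is illustrative only.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Slot precision and recall as reported in the evaluation results above.
slot_p, slot_r = 0.5398, 0.6408
slot_f1 = f1_from_pr(slot_p, slot_r)

# Rounded to four decimals, this should match the reported Slot F1.
print(round(slot_f1, 4))
```

The zero guard matters for the first epoch of the training log, where precision and recall are both 0.0.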

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 256
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 30
  • mixed_precision_training: Native AMP
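
To make the batch-size and schedule settings concrete, here is a small sketch in plain Python that derives the effective batch size, the planned number of optimizer steps, and the learning rate at a given step. It rests on assumptions the card does not state: a single training device, 45 optimizer steps per epoch (read off the Step column of the training results), and a schedule of linear warmup followed by half-cosine decay to zero. The function `cosine_lr` is a hypothetical helper, not the training code that produced this model.

```python
import math

# Values taken from the hyperparameter list above.
PEAK_LR = 5e-5
PER_DEVICE_BATCH = 128
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.06
NUM_EPOCHS = 30

# Assumptions (not stated in the card): one device, 45 optimizer steps per epoch.
NUM_DEVICES = 1
STEPS_PER_EPOCH = 45

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def cosine_lr(step: int) -> float:
    """Linear warmup to PEAK_LR, then half-cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print(effective_batch, total_steps, warmup_steps)
print(cosine_lr(585))  # learning rate at the last logged step, under these assumptions
```

Under these assumptions the effective batch size matches the listed total_train_batch_size of 256. The training log ends at step 585, well short of the planned step count, so the learning rate would still have been far from zero when training stopped.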

Training results

| Training Loss | Epoch | Step | Validation Loss | Slot P | Slot R | Slot F1 | Slot Exact Match | Intent Acc |
|---------------|-------|------|-----------------|--------|--------|---------|------------------|------------|
| No log        | 1.0   | 45   | 43.3614         | 0.0    | 0.0    | 0.0     | 0.3178           | 0.0821     |
| 160.2669      | 2.0   | 90   | 27.2292         | 0.3143 | 0.2269 | 0.2635  | 0.3586           | 0.2548     |
| 66.654        | 3.0   | 135  | 19.2474         | 0.4379 | 0.4    | 0.4181  | 0.4481           | 0.4575     |
| 38.629        | 4.0   | 180  | 15.3625         | 0.4023 | 0.5408 | 0.4614  | 0.4801           | 0.5903     |
| 23.3498       | 5.0   | 225  | 12.4194         | 0.4446 | 0.5706 | 0.4998  | 0.5411           | 0.6695     |
| 12.7922       | 6.0   | 270  | 12.3227         | 0.5013 | 0.5980 | 0.5454  | 0.5691           | 0.6990     |
| 7.8613        | 7.0   | 315  | 12.8060         | 0.4926 | 0.6    | 0.5410  | 0.5642           | 0.7324     |
| 5.4037        | 8.0   | 360  | 12.9247         | 0.5086 | 0.6294 | 0.5626  | 0.5809           | 0.7388     |
| 3.6892        | 9.0   | 405  | 13.9871         | 0.5260 | 0.6343 | 0.5751  | 0.5986           | 0.7605     |
| 2.6797        | 10.0  | 450  | 14.0965         | 0.5562 | 0.6204 | 0.5865  | 0.6011           | 0.7742     |
| 2.6797        | 11.0  | 495  | 13.8520         | 0.5105 | 0.6398 | 0.5679  | 0.5775           | 0.7698     |
| 2.0031        | 12.0  | 540  | 15.0858         | 0.5491 | 0.6289 | 0.5863  | 0.6080           | 0.7698     |
| 1.3894        | 13.0  | 585  | 15.5718         | 0.5398 | 0.6408 | 0.5860  | 0.6001           | 0.7831     |
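
Validation loss reaches its minimum early while slot F1 and intent accuracy keep improving, so which checkpoint counts as "best" depends on the selection metric. The sketch below copies four columns from the training results above and picks the best epoch under each criterion. The tuple layout and variable names are illustrative, and the card does not say which criterion, if any, was used to select the released weights.

```python
# (epoch, validation_loss, slot_f1, intent_acc), copied from the training results.
RESULTS = [
    (1, 43.3614, 0.0, 0.0821),
    (2, 27.2292, 0.2635, 0.2548),
    (3, 19.2474, 0.4181, 0.4575),
    (4, 15.3625, 0.4614, 0.5903),
    (5, 12.4194, 0.4998, 0.6695),
    (6, 12.3227, 0.5454, 0.6990),
    (7, 12.8060, 0.5410, 0.7324),
    (8, 12.9247, 0.5626, 0.7388),
    (9, 13.9871, 0.5751, 0.7605),
    (10, 14.0965, 0.5865, 0.7742),
    (11, 13.8520, 0.5679, 0.7698),
    (12, 15.0858, 0.5863, 0.7698),
    (13, 15.5718, 0.5860, 0.7831),
]

# Lower is better for loss; higher is better for F1 and accuracy.
best_loss = min(RESULTS, key=lambda row: row[1])
best_f1 = max(RESULTS, key=lambda row: row[2])
best_intent = max(RESULTS, key=lambda row: row[3])

print("lowest validation loss: epoch", best_loss[0])
print("highest slot F1:        epoch", best_f1[0])
print("highest intent acc:     epoch", best_intent[0])
```

The headline evaluation numbers on this card match the final logged row (epoch 13), which has the highest intent accuracy but neither the lowest validation loss nor the highest slot F1.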

Framework versions

  • Transformers 4.55.0
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.4

Model size: 395M params (Safetensors, tensor type F32)