KoModernBERT-chp-01

This model is a fine-tuned version of CocoRoF/KoModernBERT on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1915

Model description

More information needed

Intended uses & limitations

More information needed
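
Until this section is filled in, a minimal usage sketch may help. It assumes this checkpoint loads with the standard Transformers fill-mask pipeline (ModernBERT-family models are masked language models); the Korean example sentence is purely illustrative and not from the authors:

```python
# Minimal sketch, not the authors' documented usage: load the checkpoint
# with the standard fill-mask pipeline and query masked-token candidates.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="CocoRoF/KoModernBERT-chp-01")

# Illustrative sentence: "The capital of South Korea is [MASK]."
masked = f"대한민국의 수도는 {fill_mask.tokenizer.mask_token}이다."
for candidate in fill_mask(masked, top_k=5):
    print(candidate["token_str"], candidate["score"])
```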

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 512
  • total_eval_batch_size: 64
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1.0
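
As a convenience, here is a minimal sketch of the same settings expressed as Transformers TrainingArguments, assuming the Hugging Face Trainer was used. The output_dir is a placeholder, and the 8-GPU distributed launch would come from torchrun/accelerate rather than from these arguments:

```python
# Sketch only: restates the listed hyperparameters as TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="KoModernBERT-chp-01",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,     # 8 per device x 8 GPUs x 8 steps = 512 total
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=1.0,
    fp16=True,                         # assumption: checkpoint is stored in FP16
)
```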

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 18.3369       | 0.0904 | 5000  | 2.3348          |
| 18.1338       | 0.1808 | 10000 | 2.3040          |
| 18.4136       | 0.2712 | 15000 | 2.2834          |
| 17.9531       | 0.3616 | 20000 | 2.2649          |
| 17.8586       | 0.4520 | 25000 | 2.2476          |
| 17.8711       | 0.5424 | 30000 | 2.2407          |
| 17.9052       | 0.6329 | 35000 | 2.2233          |
| 17.8385       | 0.7233 | 40000 | 2.2143          |
| 17.7234       | 0.8137 | 45000 | 2.2101          |
| 17.2833       | 0.9041 | 50000 | 2.2030          |
| 17.8717       | 0.9945 | 55000 | 2.1915          |
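
If the reported losses are standard masked-language-modeling cross-entropy, the final validation loss of 2.1915 corresponds to a perplexity of exp(2.1915) ≈ 8.95 over masked positions.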

Framework versions

  • Transformers 4.48.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model details

  • Model size: 153M params
  • Tensor type: FP16
  • Format: Safetensors