Hubert-kakeiken-W-office

This model is a fine-tuned version of rinna/japanese-hubert-base on the ORIGINAL_KAKEIKEN_W_OFFICE - JA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0194
  • WER: 0.9988
  • CER: 1.0151
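
For reference, here is a minimal inference sketch. It assumes this checkpoint is a CTC-based Japanese ASR model (HubertForCTC) with a Wav2Vec2-style processor and 16 kHz mono input; the audio path is a placeholder, not something specified by this card:

```python
# Minimal sketch, assuming a HubertForCTC checkpoint with a Wav2Vec2-style
# processor; "sample.wav" is a placeholder path.
import torch
import librosa
from transformers import HubertForCTC, Wav2Vec2Processor

model_id = "utakumi/Hubert-kakeiken-W-office"
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = HubertForCTC.from_pretrained(model_id).eval()

# HuBERT-base checkpoints expect mono 16 kHz audio.
speech, _ = librosa.load("sample.wav", sr=16_000)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```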

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list for how they map to TrainingArguments):

  • learning_rate: 3e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 12500
  • num_epochs: 40.0
  • mixed_precision_training: Native AMP
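
A minimal sketch of how these settings map onto Hugging Face TrainingArguments, assuming the standard Trainer API; output_dir is a placeholder, and fp16=True stands in for the "Native AMP" line:

```python
# Sketch only: reconstructs the hyperparameters listed above as
# TrainingArguments. output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Hubert-kakeiken-W-office",  # placeholder
    learning_rate=3e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # 32 x 2 = total train batch size 64
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=12500,
    num_train_epochs=40.0,
    fp16=True,  # Native AMP mixed-precision training
)
```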

Training results

| Training Loss | Epoch   | Step  | Validation Loss | WER    | CER    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
| 25.4955       | 1.0     | 820   | 10.0573         | 1.0    | 1.1284 |
| 8.4563        | 2.0     | 1640  | 7.0638          | 1.0    | 1.1284 |
| 6.5293        | 3.0     | 2460  | 4.0992          | 1.0    | 1.1285 |
| 3.6116        | 4.0     | 3280  | 2.9955          | 1.0    | 1.1284 |
| 2.6876        | 5.0     | 4100  | 2.4298          | 1.0    | 1.1284 |
| 2.2577        | 6.0     | 4920  | 1.2720          | 1.0    | 1.0414 |
| 0.987         | 7.0     | 5740  | 0.5797          | 0.9999 | 1.0195 |
| 0.482         | 8.0     | 6560  | 0.2601          | 0.9993 | 1.0342 |
| 0.3807        | 9.0     | 7380  | 0.1617          | 0.9991 | 1.0305 |
| 0.2928        | 10.0    | 8200  | 0.1036          | 0.9988 | 1.0228 |
| 0.2525        | 11.0    | 9020  | 0.0632          | 0.9988 | 1.0229 |
| 0.2244        | 12.0    | 9840  | 0.0649          | 0.9988 | 1.0276 |
| 0.2248        | 13.0    | 10660 | 0.0414          | 0.9991 | 1.0194 |
| 0.2071        | 14.0    | 11480 | 0.0473          | 0.9990 | 1.0212 |
| 0.2003        | 15.0    | 12300 | 0.0426          | 0.9991 | 1.0206 |
| 0.1921        | 16.0    | 13120 | 0.0603          | 0.9991 | 1.0241 |
| 0.1853        | 17.0    | 13940 | 0.0346          | 0.9990 | 1.0208 |
| 0.1725        | 18.0    | 14760 | 0.0320          | 0.9990 | 1.0192 |
| 0.1663        | 19.0    | 15580 | 0.0343          | 0.9990 | 1.0189 |
| 0.1665        | 20.0    | 16400 | 0.0245          | 0.9990 | 1.0172 |
| 0.1567        | 21.0    | 17220 | 0.0254          | 0.9991 | 1.0176 |
| 0.1494        | 22.0    | 18040 | 0.0416          | 0.9990 | 1.0231 |
| 0.1453        | 23.0    | 18860 | 0.0267          | 0.9988 | 1.0162 |
| 0.1365        | 24.0    | 19680 | 0.0240          | 0.9988 | 1.0161 |
| 0.1386        | 25.0    | 20500 | 0.0202          | 0.9988 | 1.0163 |
| 0.1283        | 26.0    | 21320 | 0.0269          | 0.9988 | 1.0159 |
| 0.1291        | 27.0    | 22140 | 0.0258          | 0.9988 | 1.0150 |
| 0.1168        | 28.0    | 22960 | 0.0164          | 0.9988 | 1.0157 |
| 0.1173        | 29.0    | 23780 | 0.0202          | 0.9988 | 1.0154 |
| 0.113         | 30.0    | 24600 | 0.0203          | 0.9988 | 1.0161 |
| 0.108         | 31.0    | 25420 | 0.0291          | 0.9988 | 1.0167 |
| 0.0994        | 32.0    | 26240 | 0.0230          | 0.9988 | 1.0161 |
| 0.0946        | 33.0    | 27060 | 0.0180          | 0.9988 | 1.0148 |
| 0.0945        | 34.0    | 27880 | 0.0214          | 0.9988 | 1.0156 |
| 0.0909        | 35.0    | 28700 | 0.0225          | 0.9990 | 1.0157 |
| 0.0904        | 36.0    | 29520 | 0.0185          | 0.9988 | 1.0153 |
| 0.0859        | 37.0    | 30340 | 0.0188          | 0.9988 | 1.0152 |
| 0.0866        | 38.0    | 31160 | 0.0194          | 0.9988 | 1.0152 |
| 0.0846        | 39.0    | 31980 | 0.0191          | 0.9988 | 1.0151 |
| 0.0865        | 39.9518 | 32760 | 0.0192          | 0.9988 | 1.0151 |
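
For context on the WER and CER columns, here is a sketch of how these metrics are typically computed with the Hugging Face evaluate library; the transcripts below are hypothetical, not taken from this dataset:

```python
# Sketch of WER/CER computation with the `evaluate` library.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Hypothetical transcripts, not data from this card.
predictions = ["今日わいい天気です"]
references = ["今日はいい天気です"]

# WER splits on whitespace, so unsegmented Japanese text is scored as a
# single "word"; CER compares character by character and is usually the more
# informative metric here. Insertions count against the reference length,
# which is why a CER above 1.0 (as in the table) is possible.
print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```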

Framework versions

  • Transformers 4.48.0
  • PyTorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0