Hubert-kakeiken-W-closed_add_ver2

This model is a fine-tuned version of rinna/japanese-hubert-base on the ORIGINAL_KAKEIKEN_W_CLOSED_ADD_VER2 - JA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0617
  • Wer: 0.9988
  • Cer: 1.0129
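
The checkpoint can be loaded for inference with the transformers Auto classes. The following is a minimal sketch, assuming the fine-tune was exported with a CTC head and a compatible processor (the usual setup for HuBERT-based ASR); "sample.wav" is a hypothetical placeholder path.

```python
import librosa
import torch
from transformers import AutoModelForCTC, AutoProcessor

model_id = "utakumi/Hubert-kakeiken-W-closed_add_ver2"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCTC.from_pretrained(model_id)
model.eval()

# HuBERT expects 16 kHz mono input; resample on load.
speech, _ = librosa.load("sample.wav", sr=16_000, mono=True)  # placeholder path

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding: argmax over the vocabulary, then collapse with the tokenizer.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```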

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 3e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 12500
  • num_epochs: 40.0
  • mixed_precision_training: Native AMP
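
For orientation, the list above maps onto transformers TrainingArguments roughly as sketched below. This is a hypothetical reconstruction, not the author's actual script; output_dir and the per-epoch eval cadence are assumptions inferred from the results table.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Hubert-kakeiken-W-closed_add_ver2",  # assumption
    learning_rate=3e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 32 * 2 = 64
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=12500,
    num_train_epochs=40.0,
    fp16=True,  # "Native AMP" mixed precision
    eval_strategy="epoch",  # assumption: the results table logs one eval per epoch
)
```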

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Wer    | Cer    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
| 28.4059       | 1.0     | 880   | 10.6721         | 1.0    | 1.1284 |
| 9.1792        | 2.0     | 1760  | 6.9924          | 1.0    | 1.1284 |
| 4.9143        | 3.0     | 2640  | 3.8166          | 1.0    | 1.1284 |
| 3.1394        | 4.0     | 3520  | 2.8829          | 1.0    | 1.1283 |
| 2.7266        | 5.0     | 4400  | 1.9608          | 1.0    | 1.1444 |
| 1.4314        | 6.0     | 5280  | 0.8434          | 0.9999 | 1.0662 |
| 0.6837        | 7.0     | 6160  | 0.4583          | 0.9997 | 1.0330 |
| 0.403         | 8.0     | 7040  | 0.2512          | 0.9991 | 1.0479 |
| 0.3035        | 9.0     | 7920  | 0.1972          | 0.9993 | 1.0365 |
| 0.229         | 10.0    | 8800  | 0.0872          | 0.9991 | 1.0264 |
| 0.1995        | 11.0    | 9680  | 0.0959          | 0.9988 | 1.0262 |
| 0.1824        | 12.0    | 10560 | 0.1012          | 0.9988 | 1.0317 |
| 0.1774        | 13.0    | 11440 | 0.0541          | 0.9991 | 1.0220 |
| 0.1739        | 14.0    | 12320 | 0.0703          | 0.9990 | 1.0270 |
| 0.1609        | 15.0    | 13200 | 0.0480          | 0.9988 | 1.0203 |
| 0.1512        | 16.0    | 14080 | 0.0540          | 0.9988 | 1.0162 |
| 0.1412        | 17.0    | 14960 | 0.0396          | 0.9988 | 1.0188 |
| 0.1391        | 18.0    | 15840 | 0.0493          | 0.9988 | 1.0195 |
| 0.1325        | 19.0    | 16720 | 0.0366          | 0.9988 | 1.0186 |
| 0.1242        | 20.0    | 17600 | 0.0392          | 0.9988 | 1.0178 |
| 0.122         | 21.0    | 18480 | 0.0545          | 0.9988 | 1.0193 |
| 0.1143        | 22.0    | 19360 | 0.0408          | 0.9988 | 1.0185 |
| 0.1087        | 23.0    | 20240 | 0.0310          | 0.9988 | 1.0176 |
| 0.1013        | 24.0    | 21120 | 0.0262          | 0.9988 | 1.0166 |
| 0.0998        | 25.0    | 22000 | 0.0388          | 0.9988 | 1.0199 |
| 0.0903        | 26.0    | 22880 | 0.0280          | 0.9988 | 1.0166 |
| 0.088         | 27.0    | 23760 | 0.0492          | 0.9988 | 1.0197 |
| 0.0838        | 28.0    | 24640 | 0.0230          | 0.9988 | 1.0163 |
| 0.079         | 29.0    | 25520 | 0.0282          | 0.9988 | 1.0170 |
| 0.0747        | 30.0    | 26400 | 0.0271          | 0.9988 | 1.0162 |
| 0.0692        | 31.0    | 27280 | 0.0272          | 0.9988 | 1.0167 |
| 0.0699        | 32.0    | 28160 | 0.0427          | 0.9988 | 1.0143 |
| 0.0652        | 33.0    | 29040 | 0.0324          | 0.9988 | 1.0162 |
| 0.0624        | 34.0    | 29920 | 0.0315          | 0.9988 | 1.0163 |
| 0.0588        | 35.0    | 30800 | 0.0549          | 0.9988 | 1.0137 |
| 0.0594        | 36.0    | 31680 | 0.0457          | 0.9988 | 1.0142 |
| 0.0619        | 37.0    | 32560 | 0.0463          | 0.9988 | 1.0144 |
| 0.058         | 38.0    | 33440 | 0.0665          | 0.9988 | 1.0127 |
| 0.059         | 39.0    | 34320 | 0.0595          | 0.9988 | 1.0131 |
| 0.0563        | 39.9551 | 35160 | 0.0581          | 0.9988 | 1.0133 |
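
The Wer and Cer columns are word and character error rates. One hedged note on the values: Japanese is normally written without spaces, so whitespace-based WER treats each unsegmented sentence as a single "word", which would keep WER near 1.0 even as CER and loss improve. Below is a minimal sketch of how such metrics are typically computed with the evaluate library; the strings are illustrative placeholders, not dataset samples.

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

references = ["今日は良い天気です"]   # placeholder reference transcript
predictions = ["今日は悪い天気です"]  # placeholder hypothesis: one character wrong

# With no spaces, each sentence is one "word": WER = 1.0,
# while CER reflects the single character substitution (1/9 ≈ 0.11).
print("WER:", wer_metric.compute(references=references, predictions=predictions))
print("CER:", cer_metric.compute(references=references, predictions=predictions))
```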

Framework versions

  • Transformers 4.48.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0