w2v-bert-malayalam

This model is a fine-tuned version of facebook/w2v-bert-2.0 for Malayalam automatic speech recognition. The fine-tuning dataset is not documented. It achieves the following results on the evaluation set:

  • Loss: 0.1149
  • WER (word error rate): 0.0646

Model description

More information needed

Intended uses & limitations

More information needed
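
Since the card provides no usage details, the following is a minimal inference sketch, not an official example. It assumes the repository ships a compatible processor/tokenizer alongside the checkpoint, and "sample.wav" is a placeholder path to a 16 kHz mono recording.

```python
# Minimal inference sketch (assumptions: the repo contains a processor,
# and "sample.wav" is a placeholder 16 kHz mono audio file).
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2BertForCTC

model_id = "cdactvm/w2v-bert-malayalam"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2BertForCTC.from_pretrained(model_id)
model.eval()

speech, _ = librosa.load("sample.wav", sr=16_000)  # placeholder audio path

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: argmax over each frame, then collapse repeats and
# blanks inside batch_decode.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```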

Training and evaluation data

More information needed
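
The actual training corpus is undocumented. Purely as an illustration of how a Malayalam speech dataset could be loaded and resampled for this model, the sketch below uses Common Voice Malayalam ("ml"); this is an assumption for demonstration, not the dataset used here.

```python
# Illustration only: the real training data is not documented. Common Voice
# Malayalam is used as an example corpus (accessing it requires accepting
# the dataset's terms on the Hugging Face Hub).
from datasets import load_dataset, Audio

ds = load_dataset("mozilla-foundation/common_voice_17_0", "ml", split="train")
ds = ds.cast_column("audio", Audio(sampling_rate=16_000))  # resample to 16 kHz
print(ds[0]["audio"]["array"].shape, ds[0]["sentence"])
```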

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
  • mixed_precision_training: Native AMP
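
A sketch of a TrainingArguments configuration mirroring the values above; the output_dir is an assumption, and "Native AMP" is expressed here as fp16=True (bf16 would also be a valid reading).

```python
# Sketch of the listed hyperparameters as TrainingArguments.
# output_dir and the fp16 (vs. bf16) choice are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="w2v-bert-malayalam",  # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=4,    # effective train batch size: 2 * 4 = 8
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=10,
    fp16=True,                        # "Native AMP" mixed-precision training
)
```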

Training results

| Training Loss | Epoch  | Step  | Validation Loss | WER    |
|---------------|--------|-------|-----------------|--------|
| 0.3705        | 0.2758 | 2000  | 0.3227          | 0.3629 |
| 0.291         | 0.5516 | 4000  | 0.2434          | 0.2891 |
| 0.2695        | 0.8274 | 6000  | 0.2445          | 0.2775 |
| 0.2118        | 1.1032 | 8000  | 0.1979          | 0.2567 |
| 0.1923        | 1.3790 | 10000 | 0.1852          | 0.2213 |
| 0.1788        | 1.6548 | 12000 | 0.1691          | 0.2033 |
| 0.167         | 1.9306 | 14000 | 0.1870          | 0.1955 |
| 0.1612        | 2.2063 | 16000 | 0.1571          | 0.1731 |
| 0.1516        | 2.4821 | 18000 | 0.1406          | 0.1685 |
| 0.1597        | 2.7579 | 20000 | 0.1358          | 0.1496 |
| 0.1299        | 3.0336 | 22000 | 0.1332          | 0.1397 |
| 0.1096        | 3.3095 | 24000 | 0.1397          | 0.1384 |
| 0.1291        | 3.5853 | 26000 | 0.1298          | 0.1354 |
| 0.0975        | 3.8611 | 28000 | 0.1220          | 0.1134 |
| 0.0919        | 4.1368 | 30000 | 0.1261          | 0.1081 |
| 0.0806        | 4.4126 | 32000 | 0.1189          | 0.1120 |
| 0.0778        | 4.6884 | 34000 | 0.1159          | 0.1027 |
| 0.0922        | 4.9642 | 36000 | 0.1218          | 0.1027 |
| 0.0907        | 5.2400 | 38000 | 0.1099          | 0.0977 |
| 0.0708        | 5.5158 | 40000 | 0.1043          | 0.0920 |
| 0.0715        | 5.7916 | 42000 | 0.1048          | 0.0928 |
| 0.0646        | 6.0673 | 44000 | 0.1047          | 0.0893 |
| 0.0567        | 6.3431 | 46000 | 0.1294          | 0.0891 |
| 0.0729        | 6.6189 | 48000 | 0.1236          | 0.0873 |
| 0.0607        | 6.8947 | 50000 | 0.1182          | 0.0830 |
| 0.0555        | 7.1705 | 52000 | 0.1222          | 0.0809 |
| 0.0516        | 7.4463 | 54000 | 0.1145          | 0.0798 |
| 0.0429        | 7.7221 | 56000 | 0.0915          | 0.0763 |
| 0.0399        | 7.9979 | 58000 | 0.0987          | 0.0731 |
| 0.0373        | 8.2736 | 60000 | 0.1167          | 0.0714 |
| 0.0371        | 8.5494 | 62000 | 0.1130          | 0.0710 |
| 0.0412        | 8.8252 | 64000 | 0.1194          | 0.0707 |
| 0.0282        | 9.1009 | 66000 | 0.1217          | 0.0683 |
| 0.0284        | 9.3768 | 68000 | 0.1177          | 0.0671 |
| 0.0275        | 9.6526 | 70000 | 0.1117          | 0.0661 |
| 0.0216        | 9.9284 | 72000 | 0.1149          | 0.0646 |
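
The WER column is the standard word error rate. As a reference for how such a metric can be computed, here is a minimal sketch using the Hugging Face evaluate library; the transcripts are hypothetical placeholders, not real model output.

```python
# Sketch of computing WER with the `evaluate` library; the strings below
# are hypothetical placeholders, not actual evaluation data.
import evaluate

wer = evaluate.load("wer")
predictions = ["hello world", "transcribed speech"]       # hypothetical hypotheses
references = ["hello world", "transcribed speech here"]   # hypothetical references
print(wer.compute(predictions=predictions, references=references))
```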

Framework versions

  • Transformers 4.48.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0