base_model: openai/whisper-medium
datasets:
  - google/fleurs
language:
  - hi
license: apache-2.0
metrics:
  - wer
tags:
  - generated_from_trainer
model-index:
  - name: Whisper Medium Hindi -megha sharma
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: Google Fleurs
          type: google/fleurs
          config: hi_in
          split: None
          args: 'config: hi, split: test'
        metrics:
          - type: wer
            value: 17.746973838344395
            name: Wer

Whisper Medium Hindi -megha sharma

This model is a fine-tuned version of openai/whisper-medium on the Hindi (hi_in) configuration of the Google Fleurs dataset. At the last logged evaluation (step 17000), it achieves the following results on the evaluation set:

  • Loss: 0.4008
  • Wer: 17.7470
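
For readers unfamiliar with the metric, the sketch below shows how a word error rate can be computed from a reference and a hypothesis transcript. It is a minimal illustration, not the evaluation code used for this model: the function name `word_error_rate` is hypothetical, whitespace tokenization is an assumption, and the card does not state whether any text normalization was applied before scoring. It also assumes the figure reported above is a percentage.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / number of reference words.

    Hypothetical helper for illustration; tokenization is plain whitespace splitting.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of five reference words.
    print(word_error_rate("मैं घर जा रहा हूँ", "मैं घर जा रही हूँ"))
```

Under these assumptions, a WER of 17.7470 corresponds to roughly 18 word-level errors per 100 reference words.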

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • training_steps: 20000
  • mixed_precision_training: Native AMP
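
The learning-rate settings above (peak 5e-06, linear scheduler, 1000 warmup steps, 20000 training steps) can be sketched as a plain function of the step number. This is an illustration of a linear warmup followed by linear decay to zero, which is how a linear schedule with warmup is commonly defined. The function name `linear_schedule_lr` and the constants are hypothetical, and the actual training script is not part of this card.

```python
PEAK_LR = 5e-6        # learning_rate from the list above
WARMUP_STEPS = 1000   # lr_scheduler_warmup_steps
TOTAL_STEPS = 20000   # training_steps


def linear_schedule_lr(
    step: int,
    peak_lr: float = PEAK_LR,
    warmup_steps: int = WARMUP_STEPS,
    total_steps: int = TOTAL_STEPS,
) -> float:
    """Learning rate at a given optimizer step (hypothetical helper).

    Rises linearly from 0 to peak_lr over the warmup, then decays
    linearly so that it reaches 0 at total_steps.
    """
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 500, 1000, 10000, 17000, 20000):
        print(step, linear_schedule_lr(step))
```

Under this sketch the rate peaks at step 1000 and has decayed to roughly 7.9e-07 by step 17000, the last step logged in the training results.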

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Wer     |
|---------------|---------|-------|-----------------|---------|
| 0.067         | 3.3898  | 1000  | 0.2071          | 20.8024 |
| 0.0116        | 6.7797  | 2000  | 0.2594          | 19.6505 |
| 0.0032        | 10.1695 | 3000  | 0.2891          | 19.0062 |
| 0.0029        | 13.5593 | 4000  | 0.3075          | 18.9086 |
| 0.0026        | 16.9492 | 5000  | 0.3211          | 19.1722 |
| 0.0033        | 20.3390 | 6000  | 0.3254          | 18.6841 |
| 0.0014        | 23.7288 | 7000  | 0.3304          | 18.2546 |
| 0.0008        | 27.1186 | 8000  | 0.3422          | 18.4889 |
| 0.0023        | 30.5085 | 9000  | 0.3379          | 18.0886 |
| 0.0009        | 33.8983 | 10000 | 0.3525          | 18.4010 |
| 0.0006        | 37.2881 | 11000 | 0.3511          | 18.0301 |
| 0.0001        | 40.6780 | 12000 | 0.3651          | 18.1863 |
| 0.0001        | 44.0678 | 13000 | 0.3627          | 17.8446 |
| 0.0           | 47.4576 | 14000 | 0.3775          | 17.6982 |
| 0.0           | 50.8475 | 15000 | 0.3868          | 17.7079 |
| 0.0           | 54.2373 | 16000 | 0.3944          | 17.7079 |
| 0.0           | 57.6271 | 17000 | 0.4008          | 17.7470 |
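
The logged results can be inspected programmatically, for example to locate the evaluation with the lowest WER or to estimate how many optimizer steps make up one epoch. The snippet below is a small sketch that copies the rows from the training results. The variable names are hypothetical, and the implied number of training examples assumes a single device and no gradient accumulation, neither of which is stated in this card.

```python
from statistics import mean

TRAIN_BATCH_SIZE = 8  # train_batch_size from the hyperparameter list

# (training_loss, epoch, step, validation_loss, wer), copied from the training results
ROWS = [
    (0.067, 3.3898, 1000, 0.2071, 20.8024),
    (0.0116, 6.7797, 2000, 0.2594, 19.6505),
    (0.0032, 10.1695, 3000, 0.2891, 19.0062),
    (0.0029, 13.5593, 4000, 0.3075, 18.9086),
    (0.0026, 16.9492, 5000, 0.3211, 19.1722),
    (0.0033, 20.3390, 6000, 0.3254, 18.6841),
    (0.0014, 23.7288, 7000, 0.3304, 18.2546),
    (0.0008, 27.1186, 8000, 0.3422, 18.4889),
    (0.0023, 30.5085, 9000, 0.3379, 18.0886),
    (0.0009, 33.8983, 10000, 0.3525, 18.4010),
    (0.0006, 37.2881, 11000, 0.3511, 18.0301),
    (0.0001, 40.6780, 12000, 0.3651, 18.1863),
    (0.0001, 44.0678, 13000, 0.3627, 17.8446),
    (0.0, 47.4576, 14000, 0.3775, 17.6982),
    (0.0, 50.8475, 15000, 0.3868, 17.7079),
    (0.0, 54.2373, 16000, 0.3944, 17.7079),
    (0.0, 57.6271, 17000, 0.4008, 17.7470),
]

# Evaluation rows with the lowest WER and the lowest validation loss.
best_by_wer = min(ROWS, key=lambda row: row[4])
best_by_loss = min(ROWS, key=lambda row: row[3])

# Optimizer steps per epoch, averaged over all logged rows.
steps_per_epoch = mean(step / epoch for _, epoch, step, _, _ in ROWS)

# Upper bound on training examples per epoch (the last batch may be partial);
# assumes one device and no gradient accumulation.
approx_examples = round(steps_per_epoch) * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("lowest WER:", best_by_wer[4], "at step", best_by_wer[2])
    print("lowest validation loss:", best_by_loss[3], "at step", best_by_loss[2])
    print("steps per epoch:", round(steps_per_epoch))
    print("approx. examples per epoch (upper bound):", approx_examples)
```

In these logs the lowest WER (17.6982) occurs at step 14000, while the validation loss is lowest at step 1000 and rises afterwards as the training loss approaches zero.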

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1