metadata
library_name: transformers
language:
  - en
license: apache-2.0
base_model: openai/whisper-base
tags:
  - generated_from_trainer
datasets:
  - iFaz/Whisper_Compatible_SER_benchmark
metrics:
  - wer
model-index:
  - name: >-
      whisper-SER-base-v7(skip_special_tokens=True during and lr = 1e-05 steps =
      12k ,warmup = 500)
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: >-
            Whisper_Compatible_SER_benchmark +
            enhanced_facebook_voxpopulik_16k_Whisper_Compatible
          type: iFaz/Whisper_Compatible_SER_benchmark
          args: 'config: en, split: test'
        metrics:
          - name: Wer
            type: wer
            value: 56.95732838589982

whisper-SER-base-v7(skip_special_tokens=True during and lr = 1e-05 steps = 12k ,warmup = 500)

This model is a fine-tuned version of openai/whisper-base, trained on the combined Whisper_Compatible_SER_benchmark and enhanced_facebook_voxpopulik_16k_Whisper_Compatible datasets. At the final training step (12,000), it achieves the following results on the evaluation set:

  • Validation loss: 0.0978
  • WER: 56.9573
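
For readers unfamiliar with the metric, word error rate is the word-level edit distance between the reference and the predicted transcript (substitutions + deletions + insertions), divided by the number of reference words. The evaluation script is not part of this card, so the following is only a minimal pure-Python sketch of the standard definition. The example sentences are hypothetical, and the scaling by 100 assumes that the value reported above is a percentage.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Classic Levenshtein dynamic programme, kept one row at a time.
    # At the start of iteration i, prev[j] is the edit distance between the
    # first i-1 reference words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentences, not taken from the evaluation set.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"

# Assumption: the card reports WER as a percentage, hence the factor of 100.
wer_percent = 100 * word_error_rate(reference, hypothesis)
print(f"{wer_percent:.2f}")  # 33.33 (one substitution + one deletion over six words)
```

Because insertions count as errors, this ratio can exceed 100% when the hypothesis is much longer than the reference.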

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 12000
  • mixed_precision_training: Native AMP
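
To make the schedule concrete, here is a small sketch of how a linear scheduler with warmup would shape the learning rate under the values listed above: a linear ramp from 0 to 1e-05 over the first 500 steps, then a linear decay to 0 at step 12000. The training script is not included in this card, so treat this as an approximation of the usual linear-with-warmup rule rather than the exact code that was run. The function and constant names are illustrative.

```python
BASE_LR = 1e-05        # learning_rate
WARMUP_STEPS = 500     # lr_scheduler_warmup_steps
TOTAL_STEPS = 12000    # training_steps


def linear_schedule_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates (linear warmup, then linear decay)."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Decay: fall linearly from BASE_LR at the end of warmup to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


# Print the rate at a few points: start, mid-warmup, peak, mid-decay, end.
for step in (0, 250, 500, 6250, 12000):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the peak rate of 1e-05 is reached only at step 500, and the rate is back down to half of the peak by step 6250.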

Training results

| Training Loss | Epoch  | Step  | Validation Loss | WER     |
|---------------|--------|-------|-----------------|---------|
| 0.3141        | 0.5510 | 1000  | 0.3218          | 42.8881 |
| 0.1626        | 1.1019 | 2000  | 0.2021          | 58.5652 |
| 0.1553        | 1.6529 | 3000  | 0.1462          | 87.1676 |
| 0.1091        | 2.2039 | 4000  | 0.1199          | 63.8528 |
| 0.1069        | 2.7548 | 5000  | 0.1027          | 63.3271 |
| 0.042         | 3.3058 | 6000  | 0.0958          | 66.8831 |
| 0.0434        | 3.8567 | 7000  | 0.0935          | 77.2418 |
| 0.0254        | 4.4077 | 8000  | 0.0926          | 64.4712 |
| 0.0265        | 4.9587 | 9000  | 0.0939          | 59.9876 |
| 0.0136        | 5.5096 | 10000 | 0.0955          | 58.2870 |
| 0.009         | 6.0606 | 11000 | 0.0985          | 62.9561 |
| 0.0067        | 6.6116 | 12000 | 0.0978          | 56.9573 |
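
Validation loss and WER do not move together in these results, so it can help to look at each separately when choosing a checkpoint. The sketch below copies the logged rows into a small Python structure (the `EvalRow` type and variable names are illustrative, not part of any training script) and finds the step with the lowest WER, the step with the lowest validation loss, and the approximate number of optimizer steps per epoch implied by the Epoch column.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    training_loss: float
    epoch: float
    step: int
    validation_loss: float
    wer: float


# Rows copied verbatim from the training results above.
ROWS = [
    EvalRow(0.3141, 0.5510, 1000, 0.3218, 42.8881),
    EvalRow(0.1626, 1.1019, 2000, 0.2021, 58.5652),
    EvalRow(0.1553, 1.6529, 3000, 0.1462, 87.1676),
    EvalRow(0.1091, 2.2039, 4000, 0.1199, 63.8528),
    EvalRow(0.1069, 2.7548, 5000, 0.1027, 63.3271),
    EvalRow(0.042, 3.3058, 6000, 0.0958, 66.8831),
    EvalRow(0.0434, 3.8567, 7000, 0.0935, 77.2418),
    EvalRow(0.0254, 4.4077, 8000, 0.0926, 64.4712),
    EvalRow(0.0265, 4.9587, 9000, 0.0939, 59.9876),
    EvalRow(0.0136, 5.5096, 10000, 0.0955, 58.2870),
    EvalRow(0.009, 6.0606, 11000, 0.0985, 62.9561),
    EvalRow(0.0067, 6.6116, 12000, 0.0978, 56.9573),
]

best_by_wer = min(ROWS, key=lambda row: row.wer)
best_by_loss = min(ROWS, key=lambda row: row.validation_loss)

# The Epoch column is step / steps_per_epoch, so the last row gives an estimate.
steps_per_epoch = round(ROWS[-1].step / ROWS[-1].epoch)

print(best_by_wer.step, best_by_wer.wer)                # 1000 42.8881
print(best_by_loss.step, best_by_loss.validation_loss)  # 8000 0.0926
print(steps_per_epoch)                                  # 1815
```

The lowest WER (42.8881) is logged at step 1000 and the lowest validation loss (0.0926) at step 8000, while the headline numbers of this card come from the final step, 12000. If training ran on a single device without gradient accumulation (an assumption, since the card lists neither), roughly 1815 steps per epoch at a batch size of 16 would correspond to about 29,000 training examples.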

Framework versions

  • Transformers 4.49.0
  • PyTorch 2.5.1+cu121
  • Datasets 3.3.2
  • Tokenizers 0.21.0