metadata
library_name: transformers
language:
  - am
license: apache-2.0
base_model: openai/whisper-small
tags:
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_17_0
  - surafelabebe/fleurs_am
metrics:
  - wer
model-index:
  - name: Whisper Small Am - Surafel Worku
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 17.0
          type: mozilla-foundation/common_voice_17_0
          args: 'config: am, split: test'
        metrics:
          - name: Wer
            type: wer
            value: 50.96566523605151

Whisper Small Am - Surafel Worku

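This model is a fine-tuned version of openai/whisper-small, trained on the Amharic (am) configuration of Common Voice 17.0 and on surafelabebe/fleurs_am (a subset of google/fleurs). On the evaluation set (Common Voice 17.0, config am, test split), the final checkpoint at step 4000 achieves the following results: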

  • Loss: 0.4352
  • Wer: 50.9657
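
For readers unfamiliar with the metric, WER (word error rate) is the word-level edit distance between a reference transcript and the model's hypothesis, divided by the number of reference words. The snippet below is a minimal sketch of that definition in plain Python. The function name `word_error_rate` and the toy sentences are illustrative, and it is an assumption that the score reported above is this ratio multiplied by 100, as is common in Whisper fine-tuning scripts.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Toy English example (not taken from the evaluation data):
# one substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
score = 100 * word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {score:.2f}")
```

Under that assumption, a WER of about 51 means roughly one word-level error for every two reference words.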

Model description

This model was trained for 10 hours. The training results suggest overfitting: training loss falls to 0.0001, while validation loss rises from 0.3446 at step 1000 to 0.4352 at step 4000. Future improvements will focus on mitigating this by incorporating a larger dataset, extended training epochs, and dropout regularization.

Usage

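from transformers import pipeline

# Load the fine-tuned checkpoint as a speech-recognition pipeline
pipe = pipeline("automatic-speech-recognition", model="surafelabebe/whisper-small-am")

# Replace "sample.wav" with the path to your own audio file
text = pipe("sample.wav")["text"]

print(text)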

# Load a sample clip and play it in a notebook
from datasets import load_dataset
from IPython.display import Audio

dataset = load_dataset("surafelabebe/sample_tts_audio")
sample = dataset["train"][10]["audio"]  # decoded audio with "array" and "sampling_rate"
Audio(data=sample["array"], rate=sample["sampling_rate"])

| Input | Output |
| --- | --- |
| (audio sample) | ለአምባቢዎች አውምሮት ምርትን ለልባቸው ደስታን የሚሰት ልብ ወለ ድርሰት ትሩ ድርሰት ይባላል |

Training procedure

The fine-tuning steps were similar to those explained in this blog post.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 4000
  • mixed_precision_training: Native AMP
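
To make the schedule concrete, the following sketch computes the learning rate implied by the values listed above (peak 1e-05, 500 warmup steps, 4000 training steps). It assumes the usual shape of a linear schedule with warmup: a linear ramp from 0 to the peak, then a linear decay to 0 at the last step. The function name `linear_schedule_lr` is hypothetical and this is not the training code itself.

```python
def linear_schedule_lr(step: int,
                       peak_lr: float = 1e-5,
                       warmup_steps: int = 500,
                       total_steps: int = 4000) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr over the warmup steps
        return peak_lr * step / max(1, warmup_steps)
    # Decay from peak_lr to 0 over the remaining steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Print the rate at the evaluation steps and a few reference points
for step in (0, 250, 500, 1000, 2000, 3000, 4000):
    print(f"step {step:>4}: lr = {linear_schedule_lr(step):.2e}")
```

Under this assumption, the learning rate at step 3000 is already less than a third of its peak, which fits the small changes in loss and WER between the last two evaluations.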

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer |
| --- | --- | --- | --- | --- |
| 0.0108 | 9.6154 | 1000 | 0.3446 | 54.9759 |
| 0.0009 | 19.2308 | 2000 | 0.4052 | 51.7570 |
| 0.0001 | 28.8462 | 3000 | 0.4277 | 50.9388 |
| 0.0001 | 38.4615 | 4000 | 0.4352 | 50.9657 |
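
The table can be read programmatically to see where each metric is best and how far training and validation loss diverge. The sketch below copies the rows above into plain Python. The estimate of the training-set size is an assumption: it treats train_batch_size 16 as the effective batch size (one device, no gradient accumulation), which the card does not state.

```python
# (training_loss, epoch, step, validation_loss, wer), copied from the table above
rows = [
    (0.0108, 9.6154, 1000, 0.3446, 54.9759),
    (0.0009, 19.2308, 2000, 0.4052, 51.7570),
    (0.0001, 28.8462, 3000, 0.4277, 50.9388),
    (0.0001, 38.4615, 4000, 0.4352, 50.9657),
]

# Checkpoints with the lowest WER and the lowest validation loss
best_by_wer = min(rows, key=lambda r: r[4])
best_by_val_loss = min(rows, key=lambda r: r[3])
print("lowest WER at step", best_by_wer[2])
print("lowest validation loss at step", best_by_val_loss[2])

# A widening gap between validation and training loss is a sign of overfitting
for train_loss, _, step, val_loss, _ in rows:
    print(f"step {step}: validation loss - training loss = {val_loss - train_loss:.4f}")

# Optimizer steps per epoch, derived from the first row (step / epoch)
steps_per_epoch = round(rows[0][2] / rows[0][1])

# ASSUMPTION: effective batch size is 16 (single device, no gradient accumulation)
approx_train_examples = steps_per_epoch * 16
print("steps per epoch:", steps_per_epoch)
print("approximate training examples:", approx_train_examples)
```

WER is lowest at step 3000 while validation loss is lowest at step 1000, so the choice of checkpoint depends on which of the two is treated as the selection criterion.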

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0