ZeeshanGeoPk/haitian-speech-to-text

Automatic Speech Recognition

ZeeshanGeoPk committed on Feb 26, 2024

Commit 2416c78 · verified · 1 Parent(s): 5f6629c

Update README.md

Files changed (1)

README.md +54 -1

@@ -1,3 +1,56 @@
 ---
 license: apache-2.0
----

+# Haitian Speech-to-Text Model
+## Overview
+This repository contains a Whisper ASR (Automatic Speech Recognition) model fine-tuned for Haitian Creole (language code `ht`). The model is hosted on the Hugging Face Hub and can be loaded directly with the `transformers` library.
+## Performance
+The model achieved a Word Error Rate (WER) of 0.19126, meaning roughly 19 word-level errors (substitutions, deletions, or insertions) per 100 reference words when transcribing spoken Haitian to written text.
+## Training
+The model was trained with a learning rate of 1e-5.
+## Usage
+You can use this model directly from the Hugging Face Model Hub. Here's a simple example in Python:
+```python
+from transformers import WhisperProcessor, WhisperForConditionalGeneration
+import torchaudio
+# load model and processor
+processor = WhisperProcessor.from_pretrained("ZeeshanGeoPk/haitian-speech-to-text")
+model = WhisperForConditionalGeneration.from_pretrained("ZeeshanGeoPk/haitian-speech-to-text")
+# path to the audio file to transcribe (placeholder, replace with your own)
+sample_path = "path/to/audio.wav"
+# load the audio as a (channels, samples) tensor using torchaudio
+waveform, sample_rate = torchaudio.load(sample_path)
+# resample if needed (Whisper expects 16 kHz audio)
+if sample_rate != 16000:
+    resampler = torchaudio.transforms.Resample(sample_rate, 16000)
+    waveform = resampler(waveform)
+    sample_rate = 16000
+# ensure mono channel
+if waveform.shape[0] > 1:
+    waveform = waveform.mean(dim=0, keepdim=True)
+# process audio using Whisper processor (drop the channel dimension to pass a 1-D mono waveform)
+input_features = processor(waveform.squeeze(0).numpy(), sampling_rate=sample_rate, return_tensors="pt").input_features
+# generate token ids
+predicted_ids = model.generate(input_features)
+# decode token ids to text (batch_decode returns a list with one string per input)
+transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
+print(transcription[0])
+```
 ---
 license: apache-2.0
+language:
+- ht
+metrics:
+- wer
+library_name: transformers
+---
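
The README reports a WER of 0.19126 but does not say how it was computed or on which data. As a rough illustration of what the metric measures, here is a minimal sketch of the standard definition: word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical, and whitespace tokenization with no text normalization is an assumption; the author's evaluation may have used a library and normalization that this sketch does not reproduce.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]  # deleting all i reference words yields an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from hypothesis
            insertion = current[j - 1] + 1  # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current

    return previous[-1] / len(ref)


# Hypothetical example: one substitution and one deletion over six reference words.
score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {score:.3f}")
```

Because casing and punctuation count as word differences under this definition, the reference and hypothesis should be normalized the same way before scores are compared across models.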