fawzanaramam/the-truth-amma-juz-medium

	---
	language:
	- ar
	license: apache-2.0
	base_model: openai/whisper-medium
	tags:
	- fine-tuned
	- Quran
	- automatic-speech-recognition
	- arabic
	- whisper
	datasets:
	- fawzanaramam/the-amma-juz
	model-index:
	- name: Whisper Medium Finetuned on Amma Juz of Quran
	  results:
	  - task:
	      type: automatic-speech-recognition
	      name: Speech Recognition
	    dataset:
	      name: The Amma Juz Dataset
	      type: fawzanaramam/the-amma-juz
	    metrics:
	    - type: eval_loss
	      value: 0.0032
	    - type: eval_wer
	      value: 0.5102
	---

	# Whisper Medium Finetuned on Amma Juz of Quran

	This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium), tailored for transcribing Arabic Quranic recitation and trained on the Amma Juz dataset. On its evaluation set it reaches a word error rate (WER) of 0.5102%.

	## Model Description

	Whisper Medium is a transformer-based automatic speech recognition (ASR) model developed by OpenAI. This fine-tuned version uses the Amma Juz dataset to improve recognition of Quranic recitation. It targets Arabic speech transcription in religious contexts; this card reports results only on the Amma Juz evaluation set, so behavior on general-purpose ASR tasks should be verified before relying on it.

	## Performance Metrics

	On the evaluation set, the model achieved:
	- Evaluation Loss: 0.0032
	- Word Error Rate (WER): 0.5102%
	- Evaluation Runtime: 47.9061 seconds
	- Evaluation Samples per Second: 2.087
	- Evaluation Steps per Second: 0.271

	These metrics indicate very low error on the Amma Juz evaluation set. They were measured on in-domain data, so they support use for Quranic transcription of similar recordings rather than a general claim about accuracy or speed.
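The runtime and throughput figures also give a rough idea of how large the evaluation set was. The sketch below multiplies them out and compares the result with the evaluation batch size listed under the training hyperparameters. It assumes evaluation ran on a single device, which the card does not state, and the helper name `implied_counts` is purely illustrative.

```python
import math

# Figures copied from the evaluation report in this card.
eval_runtime_s = 47.9061
samples_per_second = 2.087
steps_per_second = 0.271
eval_batch_size = 8  # "Evaluation Batch Size" from the hyperparameter list


def implied_counts(runtime_s, samples_per_s, steps_per_s):
    """Return (samples, steps) implied by runtime times throughput, rounded."""
    return round(runtime_s * samples_per_s), round(runtime_s * steps_per_s)


samples, steps = implied_counts(eval_runtime_s, samples_per_second, steps_per_second)

# Assumption: one device, so steps = ceil(samples / batch size).
expected_steps = math.ceil(samples / eval_batch_size)

print(samples, steps, expected_steps)  # 100 13 13
```

Under that assumption the numbers are mutually consistent and suggest an evaluation set of roughly 100 utterances, which is worth keeping in mind when reading the reported WER.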

	## Intended Uses & Limitations

	### Intended Uses
	- Speech-to-text transcription of Quranic recitation in Arabic, specifically from the Amma Juz.
	- Research and development of tools for Quranic education and learning.
	- Projects focused on Arabic ASR in religious and educational domains.
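For the transcription use case, a minimal sketch with the Transformers `pipeline` API might look like the following. The model ID is taken from this repository's name; the helper names, the 30-second chunk length, and the decision to force Arabic transcription through `generate_kwargs` are assumptions for illustration, not settings documented by the author.

```python
def build_transcriber(model_id="fawzanaramam/the-truth-amma-juz-medium"):
    """Create an ASR pipeline for this checkpoint (downloads weights on first use)."""
    # Imported lazily so that transcribe() below can be exercised without
    # the library or the model weights being available.
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # assumption: split long recitations into 30 s chunks
    )


def transcribe(asr, audio_path):
    """Transcribe one audio file with a pipeline-like callable and return the text."""
    result = asr(
        audio_path,
        # Assumption: pin the language and task so Whisper does not auto-detect
        # the language or translate.
        generate_kwargs={"language": "arabic", "task": "transcribe"},
    )
    return result["text"].strip()


# Hypothetical usage ("recitation.wav" is a placeholder path):
# asr = build_transcriber()
# print(transcribe(asr, "recitation.wav"))
```

Passing the pipeline in as an argument keeps model loading, which is slow, separate from per-file transcription.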

	### Limitations
	- The model is fine-tuned on Quranic recitations and may not generalize well to non-Quranic Arabic speech or casual conversations.
	- Variations in recitation style, audio quality, or heavy accents may impact transcription accuracy.
	- For optimal performance, use clean and high-quality audio inputs.
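Because accuracy may drop outside the evaluation conditions, it can be useful to measure WER on your own recordings. The sketch below is a plain-Python version of the usual definition (word-level edit distance divided by the number of reference words). It is not the evaluation code used for this card, and it applies no text normalization; whether diacritics are kept or stripped before scoring is a choice that can change the result noticeably.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Row-by-row Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of four reference words.
print(word_error_rate("قل هو الله احد", "قل هو احد"))  # 0.25
```

Multiply the result by 100 to compare it with a percentage such as the one reported above. Libraries such as `jiwer` provide the same metric with more options.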

	## Training and Evaluation Data

	The model was trained using the Amma Juz dataset, which includes Quranic audio recordings and corresponding transcripts. The dataset was carefully curated to ensure the integrity and accuracy of Quranic content.

	## Training Procedure

	### Training Hyperparameters
	The following hyperparameters were used during training:
	- Learning Rate: 1e-05
	- Training Batch Size: 16
	- Evaluation Batch Size: 8
	- Seed: 42
	- Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
	- Learning Rate Scheduler: Linear
	- Warmup Steps: 10
	- Number of Epochs: 3.0
	- Mixed Precision Training: Native AMP
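As a rough guide for reproducing this setup, the listed values map onto `Seq2SeqTrainingArguments` keyword arguments as sketched below. Several points are assumptions: that training used a single device (so the batch sizes are per-device values), that "Native AMP" corresponds to `fp16=True`, and that the output directory and total step count are whatever your run uses. The `linear_lr` function is a standalone illustration of a linear schedule with warmup, not the library's implementation.

```python
# Hyperparameters from this card, keyed by Seq2SeqTrainingArguments names.
TRAINING_KWARGS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 16,  # assumption: single device
    "per_device_eval_batch_size": 8,    # assumption: single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 10,
    "num_train_epochs": 3.0,
    "fp16": True,  # assumption: "Native AMP" means fp16 mixed precision
}


def build_training_args(output_dir):
    """Build training arguments; output_dir is a placeholder chosen by the caller."""
    # Lazy import so the rest of this sketch runs without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


def linear_lr(step, total_steps, base_lr=1e-5, warmup_steps=10):
    """Learning rate at a given step: linear warmup from 0, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

With only 10 warmup steps, the learning rate reaches its peak of 1e-05 almost immediately and then decays for the rest of the three epochs; for example, `linear_lr(5, 300)` is half the peak value, where 300 is a hypothetical total step count.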

	### Framework Versions
	- Transformers: 4.41.1
	- PyTorch: 2.2.1+cu121
	- Datasets: 2.19.1
	- Tokenizers: 0.19.1
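To compare a local environment against these versions, a small check along the following lines can help. It uses only the standard library; the function name and the idea of injecting the lookup function are illustrative choices, and a mismatch does not necessarily mean the model will fail to load.

```python
from importlib import metadata

# Versions listed in this card, keyed by the installed distribution name.
EXPECTED_VERSIONS = {
    "transformers": "4.41.1",
    "torch": "2.2.1+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def find_mismatches(expected=EXPECTED_VERSIONS, lookup=metadata.version):
    """Return {package: (expected, found)} for packages that differ or are missing."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            found = lookup(package)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


# Hypothetical usage:
# for name, (wanted, found) in find_mismatches().items():
#     print(f"{name}: card lists {wanted}, environment has {found}")
```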