fawzanaramam/the-truth-amma-juz-medium

@@ -4,59 +4,80 @@ language:
 license: apache-2.0
 base_model: openai/whisper-medium
 tags:
-- generated_from_trainer
 datasets:
 - fawzanaramam/the-amma-juz
 model-index:
 - name: Whisper Medium Finetuned on Amma Juz of Quran
-  results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # Whisper Medium Finetuned on Amma Juz of Quran
-This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the The Truth Amma Juz dataset.
-It achieves the following results on the evaluation set:
-- eval_loss: 0.0032
-- eval_wer: 0.5102
-- eval_runtime: 47.9061
-- eval_samples_per_second: 2.087
-- eval_steps_per_second: 0.271
-- epoch: 0.6653
-- step: 950
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 16
-- eval_batch_size: 8
-- seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 10
-- num_epochs: 3.0
-- mixed_precision_training: Native AMP
-### Framework versions
-- Transformers 4.41.1
-- Pytorch 2.2.1+cu121
-- Datasets 2.19.1
-- Tokenizers 0.19.1

 license: apache-2.0
 base_model: openai/whisper-medium
 tags:
+- fine-tuned
+- Quran
+- automatic-speech-recognition
+- arabic
+- whisper
 datasets:
 - fawzanaramam/the-amma-juz
 model-index:
 - name: Whisper Medium Finetuned on Amma Juz of Quran
+  results:
+  - task:
+      type: automatic-speech-recognition
+      name: Speech Recognition
+    dataset:
+      name: The Amma Juz Dataset
+      type: fawzanaramam/the-amma-juz
+    metrics:
+      - type: eval_loss
+        value: 0.0032
+      - type: wer
+        value: 0.5102
 ---
 # Whisper Medium Finetuned on Amma Juz of Quran
+This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) for transcribing Arabic Quranic recitation, trained on the *Amma Juz* dataset. Fine-tuning targets a low word error rate on Quranic transcription tasks.
+## Model Description
+Whisper Medium is a transformer-based automatic speech recognition (ASR) model developed by OpenAI. This fine-tuned version uses the *Amma Juz* dataset to improve recognition of Quranic recitation. It targets Arabic speech transcription of Quranic recitation; this card reports no evaluation on general-purpose speech, so performance outside that domain is unverified.
+## Performance Metrics
+On the evaluation set, at step 950 (epoch 0.6653), the model achieved:
+- **Evaluation Loss**: 0.0032
+- **Word Error Rate (WER)**: 0.5102 (logged as `eval_wer`; the log does not state whether this is a fraction or a percentage)
+- **Evaluation Runtime**: 47.9061 seconds
+- **Evaluation Samples per Second**: 2.087
+- **Evaluation Steps per Second**: 0.271
+The low evaluation loss and WER indicate that the model transcribes the evaluation recitations closely, which makes it a candidate for applications that need accurate Quranic transcription.
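The card does not say how the WER above was computed. As a minimal sketch, assuming the standard definition (word-level edit distance divided by the number of reference words), the metric can be reproduced in plain Python. The helper name `word_error_rate` is hypothetical and is not part of the training code; the sample strings are placeholders.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: whitespace tokenization and no text normalization
    (for Arabic, diacritics would count as part of each word).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("w1 w2 w3 w4", "w1 wX w3 w4"))
```

Under this definition the function returns a fraction; multiplying by 100 gives a percentage, which is why the unit of the reported 0.5102 matters when comparing against other models.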
+## Intended Uses & Limitations
+### Intended Uses
+- **Speech-to-text transcription** of Quranic recitation in Arabic, specifically from the *Amma Juz*.
+- Research and development of tools for Quranic education and learning.
+- Projects focused on Arabic ASR in religious and educational domains.
+### Limitations
+- The model is fine-tuned on Quranic recitations and may not generalize well to non-Quranic Arabic speech or casual conversations.
+- Variations in recitation style, audio quality, or heavy accents may impact transcription accuracy.
+- For optimal performance, use clean and high-quality audio inputs.
+## Training and Evaluation Data
+The model was trained using the *Amma Juz* dataset, which includes Quranic audio recordings and corresponding transcripts. The dataset was carefully curated to ensure the integrity and accuracy of Quranic content.
+## Training Procedure
+### Training Hyperparameters
 The following hyperparameters were used during training:
+- **Learning Rate**: 1e-05
+- **Training Batch Size**: 16
+- **Evaluation Batch Size**: 8
+- **Seed**: 42
+- **Optimizer**: Adam (betas=(0.9, 0.999), epsilon=1e-08)
+- **Learning Rate Scheduler**: Linear
+- **Warmup Steps**: 10
+- **Number of Epochs**: 3.0
+- **Mixed Precision Training**: Native AMP
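To make the learning-rate settings above concrete, here is a minimal sketch of a linear schedule with warmup, assuming the usual shape: the rate ramps from 0 to the peak over the warmup steps, then decays linearly to 0 at the final step. The total step count is not stated in the card; 4284 is a rough estimate obtained by scaling the earlier card's "step 950 at epoch 0.6653" to 3 epochs. `linear_lr` is a hypothetical helper, not a Transformers API.

```python
def linear_lr(step: int,
              base_lr: float = 1e-5,     # Learning Rate from the list above
              warmup_steps: int = 10,    # Warmup Steps from the list above
              total_steps: int = 4284    # ASSUMPTION: estimated, not from the card
              ) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Warmup: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 5, 10, 950, 4284):
        print(s, linear_lr(s))
```

With only 10 warmup steps the model reaches the peak rate of 1e-05 almost immediately, so nearly all of training runs on the decaying part of the schedule.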
+### Framework Versions
+- **Transformers**: 4.41.1
+- **PyTorch**: 2.2.1+cu121
+- **Datasets**: 2.19.1
+- **Tokenizers**: 0.19.1