Whisper Small Fine-Tuned on the Amma Juz of the Quran

This model is a fine-tuned version of openai/whisper-small, specialized for transcribing Arabic audio with a focus on Quranic recitation from the Amma Juz dataset. The fine-tuning makes the model well suited to recognizing Arabic speech accurately, especially in religious and Quranic contexts.

Model Description

Whisper Small is a transformer-based model for automatic speech recognition (ASR) developed by OpenAI. Fine-tuned on the Amma Juz dataset, this version transcribes Quranic recitation with a low word error rate (see Performance Metrics below) while retaining the general capabilities of the Whisper architecture, here optimized for Arabic Quranic text.
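
For quick experimentation, the model can be loaded through the `transformers` ASR pipeline. The sketch below uses the Hub repo id for this model, fawzanaramam/the-truth-amma-juz; the audio file path is a placeholder.

```python
import torch
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
asr = pipeline(
    "automatic-speech-recognition",
    model="fawzanaramam/the-truth-amma-juz",
    device="cuda:0" if torch.cuda.is_available() else "cpu",
)

# "recitation.wav" is a placeholder path; the pipeline decodes and
# resamples common audio formats automatically.
result = asr(
    "recitation.wav",
    generate_kwargs={"language": "arabic", "task": "transcribe"},
)
print(result["text"])
```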

Performance Metrics

On the evaluation set, the model achieved:

  • Evaluation Loss: 0.0058
  • Word Error Rate (WER): 1.1494%
  • Evaluation Runtime: 44.2766 seconds
  • Evaluation Samples per Second: 2.259
  • Evaluation Steps per Second: 0.294

These metrics demonstrate the model's efficiency and accuracy when processing Quranic recitations.
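
The word error rate above is consistent with the standard `evaluate` metric used in Whisper fine-tuning recipes; a minimal sketch of how such a score is computed (with placeholder transcripts) looks like this:

```python
import evaluate

# WER counts word-level substitutions, insertions, and deletions
# relative to the reference, normalized by reference length.
wer_metric = evaluate.load("wer")

references = ["bismillah ir rahman ir rahim"]   # placeholder reference transcript
predictions = ["bismillah ir rahman ir rahim"]  # placeholder model output

wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}%")
```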

Intended Uses & Limitations

Intended Uses

  • Speech-to-text transcription of Arabic Quranic recitation, specifically from the Amma Juz.
  • Research and educational purposes in the domain of Quranic studies.
  • Applications in tools for learning Quranic recitation.

Limitations

  • The model is fine-tuned on Quranic recitation and may not perform as well on non-Quranic Arabic speech or general Arabic conversations.
  • Noise in audio inputs, variations in recitation style, or heavy accents might affect accuracy.
  • It is recommended to use clean, high-quality audio for optimal performance (see the resampling sketch below).
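
Since Whisper models consume 16 kHz mono audio, resampling recordings before transcription is a simple way to meet the clean-input recommendation. A minimal sketch using `librosa` (file paths are placeholders):

```python
import librosa
import soundfile as sf

# Load, downmix to mono, and resample to the 16 kHz rate Whisper expects.
audio, sr = librosa.load("raw_recording.wav", sr=16000, mono=True)

# Write the cleaned clip back to disk for transcription.
sf.write("recitation_16k.wav", audio, sr)
```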

Training and Evaluation Data

The model was trained using the Amma Juz dataset, which comprises Quranic audio data and corresponding transcripts. This dataset was curated to ensure high-quality representation of Quranic recitations.
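
Whisper fine-tuning typically converts each audio-transcript pair into log-Mel input features and label token ids. A minimal preprocessing sketch, assuming dataset columns named `audio` and `transcript` (both assumptions, since the dataset schema is not published):

```python
from transformers import WhisperProcessor

# The processor bundles Whisper's feature extractor and tokenizer.
processor = WhisperProcessor.from_pretrained(
    "openai/whisper-small", language="arabic", task="transcribe"
)

def prepare_example(batch):
    # Assumed schema: batch["audio"] = {"array": ..., "sampling_rate": ...}.
    audio = batch["audio"]
    batch["input_features"] = processor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]
    # Tokenize the reference transcript into label ids.
    batch["labels"] = processor.tokenizer(batch["transcript"]).input_ids
    return batch
```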

Training Procedure

Training Hyperparameters

The following hyperparameters were used during training (a matching configuration sketch follows the list):

  • Learning Rate: 1e-05
  • Training Batch Size: 16
  • Evaluation Batch Size: 8
  • Seed: 42
  • Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
  • Learning Rate Scheduler: Linear
  • Warmup Steps: 10
  • Number of Epochs: 3.0
  • Mixed Precision Training: Native AMP
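
Assuming the standard Hugging Face `Seq2SeqTrainer` recipe for Whisper fine-tuning, these values map onto `Seq2SeqTrainingArguments` roughly as follows (the output directory is a placeholder):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-amma-juz",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=10,
    num_train_epochs=3.0,
    fp16=True,  # native AMP mixed precision
)
```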

Framework Versions

  • Transformers: 4.41.1
  • PyTorch: 2.2.1+cu121
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1