metadata
library_name: transformers
language:
  - ar
license: apache-2.0
base_model: openai/whisper-base
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: Whisper base AR - BH
    results: []

Whisper base AR - BH

This model is a fine-tuned version of openai/whisper-base on the quran-ayat-speech-to-text dataset. At the final evaluation step (10400), it achieves the following results on the evaluation set:

  • Loss: 0.0124
  • WER: 11.8561
  • CER: 3.6023
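The card does not say how WER and CER were computed, so the following is only a minimal sketch of the standard definitions: edit distance between hypothesis and reference, divided by the reference length, scaled to a percentage. The function names `edit_distance` and `error_rate` are hypothetical, the sample strings are toy data, and any text normalization applied before scoring (for example, handling of Arabic diacritics) is not documented here and is not modeled.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate in percent: total edits / total reference units * 100."""
    total_edits = 0
    total_units = 0
    for ref, hyp in zip(references, hypotheses):
        # Words for WER, characters (including spaces) for CER.
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        total_edits += edit_distance(ref_units, hyp_units)
        total_units += len(ref_units)
    return 100.0 * total_edits / total_units


# Toy data, not taken from the evaluation set.
refs = ["the cat sat"]
hyps = ["the cat sit"]
wer = error_rate(refs, hyps, unit="word")  # 1 wrong word out of 3
cer = error_rate(refs, hyps, unit="char")  # 1 wrong character out of 11
```

Under this reading, a WER of 11.8561 would mean roughly 12 word-level edits per 100 reference words, assuming the reported values are percentages.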

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 4
  • mixed_precision_training: Native AMP
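Two of the values above are related by simple arithmetic, and the learning-rate settings together define a schedule. The sketch below illustrates both in plain Python. It assumes a single training device (which is what makes 16 × 4 equal the reported 64) and a schedule with linear warmup followed by a half-cosine decay to zero. The total number of optimizer steps is not stated in the card, so `ILLUSTRATIVE_TOTAL_STEPS` is a hypothetical value chosen only for the example.

```python
import math

# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 500

# Assumptions, not stated in the card.
NUM_DEVICES = 1
ILLUSTRATIVE_TOTAL_STEPS = 11000


def total_train_batch_size(per_device, accum_steps, num_devices):
    """Number of examples that contribute to one optimizer update."""
    return per_device * accum_steps * num_devices


def cosine_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS):
    """Linear warmup from 0 to base_lr, then half-cosine decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


effective_batch = total_train_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES)
lr_mid_warmup = cosine_lr(250, ILLUSTRATIVE_TOTAL_STEPS)    # halfway through warmup
lr_peak = cosine_lr(WARMUP_STEPS, ILLUSTRATIVE_TOTAL_STEPS)  # end of warmup
lr_end = cosine_lr(ILLUSTRATIVE_TOTAL_STEPS, ILLUSTRATIVE_TOTAL_STEPS)
```

With these settings the learning rate rises to its peak of 0.0001 at step 500 and then decays smoothly, so most of the later epochs run at a small fraction of the peak rate.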

Training results

| Training Loss | Epoch  | Step  | CER    | Validation Loss | WER     |
|---------------|--------|-------|--------|-----------------|---------|
| 0.0124        | 0.2895 | 800   | 6.9720 | 0.0166          | 21.3510 |
| 0.0076        | 0.5790 | 1600  | 4.4857 | 0.0124          | 14.3371 |
| 0.0042        | 0.8685 | 2400  | 4.2342 | 0.0112          | 13.1816 |
| 0.0053        | 1.1581 | 3200  | 4.8224 | 0.0133          | 14.4143 |
| 0.0041        | 1.4476 | 4000  | 4.0206 | 0.0121          | 12.9768 |
| 0.0023        | 1.7371 | 4800  | 3.7118 | 0.0116          | 11.9643 |
| 0.0022        | 2.0268 | 5600  | 4.0467 | 0.0125          | 12.7101 |
| 0.002         | 2.3163 | 6400  | 3.7803 | 0.0125          | 12.1962 |
| 0.0016        | 2.6058 | 7200  | 3.7763 | 0.0124          | 12.2696 |
| 0.0018        | 2.8952 | 8000  | 3.6627 | 0.0122          | 12.0570 |
| 0.0013        | 3.1849 | 8800  | 3.6893 | 0.0126          | 12.0957 |
| 0.0015        | 3.4744 | 9600  | 3.6893 | 0.0126          | 12.2232 |
| 0.0013        | 3.7639 | 10400 | 3.6023 | 0.0124          | 11.8561 |
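The evaluation points above can be compared programmatically, for instance to see which step minimizes each metric and to estimate the number of optimizer steps per epoch from the Step and Epoch columns. The following is a small sketch over the reported numbers. The helper name `best` is hypothetical, and the steps-per-epoch figure is an estimate derived from rounded epoch values, not a number stated in the card.

```python
# One entry per evaluation point, values taken from the training results.
FIELDS = ("step", "epoch", "validation_loss", "wer", "cer")
ROWS = [
    (800, 0.2895, 0.0166, 21.3510, 6.9720),
    (1600, 0.5790, 0.0124, 14.3371, 4.4857),
    (2400, 0.8685, 0.0112, 13.1816, 4.2342),
    (3200, 1.1581, 0.0133, 14.4143, 4.8224),
    (4000, 1.4476, 0.0121, 12.9768, 4.0206),
    (4800, 1.7371, 0.0116, 11.9643, 3.7118),
    (5600, 2.0268, 0.0125, 12.7101, 4.0467),
    (6400, 2.3163, 0.0125, 12.1962, 3.7803),
    (7200, 2.6058, 0.0124, 12.2696, 3.7763),
    (8000, 2.8952, 0.0122, 12.0570, 3.6627),
    (8800, 3.1849, 0.0126, 12.0957, 3.6893),
    (9600, 3.4744, 0.0126, 12.2232, 3.6893),
    (10400, 3.7639, 0.0124, 11.8561, 3.6023),
]
EVALS = [dict(zip(FIELDS, row)) for row in ROWS]


def best(metric):
    """Return the evaluation point with the lowest value of the given metric."""
    return min(EVALS, key=lambda e: e[metric])


# Estimate optimizer steps per epoch from the last evaluation point.
last = EVALS[-1]
steps_per_epoch = last["step"] / last["epoch"]

best_wer = best("wer")
best_cer = best("cer")
best_loss = best("validation_loss")
```

Read this way, the lowest validation loss occurs early (step 2400), while WER and CER are lowest at the final evaluation point (step 10400), so the choice of checkpoint depends on which metric matters for the intended use.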

Framework versions

  • Transformers 4.47.0
  • Pytorch 2.5.1+cu121
  • Datasets 3.3.1
  • Tokenizers 0.21.0