metadata
language:
  - dv
license: apache-2.0
base_model: openai/whisper-small
tags:
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_13_0
metrics:
  - wer
model-index:
  - name: Whisper Small Dv - Ruhullah Shaikh
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 13
          type: mozilla-foundation/common_voice_13_0
          config: dv
          split: test
          args: dv
        metrics:
          - name: Wer
            type: wer
            value: 10.97645790590117

Whisper Small Dv - Ruhullah Shaikh

This model is a fine-tuned version of openai/whisper-small on the Dhivehi (dv) configuration of the Common Voice 13 dataset (mozilla-foundation/common_voice_13_0). It achieves the following results on the evaluation set (the test split):

  • Loss: 0.3049
  • Wer Ortho: 57.3995
  • Wer: 10.9765
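This card does not define the two WER figures. The sketch below is one plausible reading, not the author's evaluation code. It assumes "Wer Ortho" is scored on the raw transcripts, "Wer" is scored after text normalization, and both are percentages. The names `word_errors`, `wer_percent`, and `simple_normalize` are hypothetical, and the normalizer is a stand-in for illustration only.

```python
import unicodedata


def word_errors(reference_words, hypothesis_words):
    """Word-level edit distance: substitutions + deletions + insertions."""
    prev = list(range(len(hypothesis_words) + 1))
    for i, ref_word in enumerate(reference_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer_percent(references, hypotheses, normalize=None):
    """Corpus-level WER in percent: total word errors / total reference words."""
    errors = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        if normalize is not None:
            ref, hyp = normalize(ref), normalize(hyp)
        ref_words, hyp_words = ref.split(), hyp.split()
        errors += word_errors(ref_words, hyp_words)
        total_words += len(ref_words)
    return 100.0 * errors / total_words


def simple_normalize(text):
    """Hypothetical normalizer (assumption): lowercase and drop punctuation."""
    kept = (ch for ch in text.lower()
            if not unicodedata.category(ch).startswith("P"))
    return " ".join("".join(kept).split())


references = ["Hello, world!"]
hypotheses = ["hello world"]
ortho_wer = wer_percent(references, hypotheses)
normalized_wer = wer_percent(references, hypotheses, normalize=simple_normalize)
```

In this toy pair every word differs only in casing or punctuation, so the raw score is much higher than the normalized one. A gap of that kind would be consistent with the difference between the two figures above, though the card does not confirm the cause.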

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • training_steps: 4000
  • mixed_precision_training: Native AMP
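As a rough illustration of how these values fit together, the sketch below recomputes the total batch size and the learning rate at a given step. It assumes a single training device (consistent with 8 × 2 = 16) and a linear schedule that ramps from 0 to the peak rate during warmup and then decays linearly to 0 at the final step. `linear_lr` is a hypothetical helper, not a library function.

```python
def linear_lr(step, base_lr=1e-5, warmup_steps=50, total_steps=4000):
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < warmup_steps:
        # Warmup: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


per_device_batch = 8   # train_batch_size
grad_accum_steps = 2   # gradient_accumulation_steps
num_devices = 1        # assumption: the card does not state the device count

# Samples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum_steps * num_devices

# Learning rate at a few points along the 4000-step run.
schedule_samples = {step: linear_lr(step) for step in (0, 25, 50, 2025, 4000)}
```

Under these assumptions the peak rate of 1e-05 is reached at step 50 and has fallen to half of that by step 2025.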

Training results

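| Training Loss | Epoch   | Step | Validation Loss | Wer Ortho | Wer     |
|---------------|---------|------|-----------------|-----------|---------|
| 0.1224        | 1.6313  | 500  | 0.1725          | 63.0197   | 13.4872 |
| 0.0448        | 3.2626  | 1000 | 0.1690          | 58.1378   | 11.5189 |
| 0.0297        | 4.8940  | 1500 | 0.1814          | 60.0251   | 11.5450 |
| 0.006         | 6.5253  | 2000 | 0.2352          | 58.2701   | 11.3503 |
| 0.0018        | 8.1566  | 2500 | 0.2639          | 58.3676   | 11.1364 |
| 0.0008        | 9.7879  | 3000 | 0.2888          | 57.7686   | 11.0738 |
| 0.0002        | 11.4192 | 3500 | 0.3015          | 57.3369   | 10.9938 |
| 0.0002        | 13.0506 | 4000 | 0.3049          | 57.3995   | 10.9765 |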
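In the results above, validation loss and WER do not bottom out at the same checkpoint. The sketch below copies the logged values and looks up the best step for each metric. The `Row` type and `best_step` helper are hypothetical names used only for this illustration.

```python
from collections import namedtuple

Row = namedtuple("Row", "train_loss epoch step val_loss wer_ortho wer")

# Values copied from the training results above.
results = [
    Row(0.1224, 1.6313, 500, 0.1725, 63.0197, 13.4872),
    Row(0.0448, 3.2626, 1000, 0.1690, 58.1378, 11.5189),
    Row(0.0297, 4.8940, 1500, 0.1814, 60.0251, 11.5450),
    Row(0.006, 6.5253, 2000, 0.2352, 58.2701, 11.3503),
    Row(0.0018, 8.1566, 2500, 0.2639, 58.3676, 11.1364),
    Row(0.0008, 9.7879, 3000, 0.2888, 57.7686, 11.0738),
    Row(0.0002, 11.4192, 3500, 0.3015, 57.3369, 10.9938),
    Row(0.0002, 13.0506, 4000, 0.3049, 57.3995, 10.9765),
]


def best_step(rows, metric):
    """Return the step whose value for `metric` is lowest (lower is better)."""
    return min(rows, key=lambda row: getattr(row, metric)).step


best_by_val_loss = best_step(results, "val_loss")
best_by_wer_ortho = best_step(results, "wer_ortho")
best_by_wer = best_step(results, "wer")

# Relative WER reduction between the first and last logged checkpoints.
relative_wer_reduction = (results[0].wer - results[-1].wer) / results[0].wer
```

Validation loss is lowest at step 1000 and rises afterwards while training loss approaches zero, yet WER keeps improving through step 4000. Which checkpoint counts as "best" therefore depends on the metric used for selection.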

Framework versions

  • Transformers 4.41.2
  • PyTorch 2.4.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1