metadata

library_name: transformers
language:
  - da
license: apache-2.0
base_model: openai/whisper-large
tags:
  - hf-asr-leaderboard
  - generated_from_trainer
datasets:
  - alexandrainst/ftspeech
metrics:
  - wer
model-index:
  - name: Whisper small FTSpeech - Julie
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: ftspeech
          type: alexandrainst/ftspeech
          args: 'split: test'
        metrics:
          - name: Wer
            type: wer
            value: 19.463820660777202

Whisper small FTSpeech - Julie

This model is a fine-tuned version of openai/whisper-large on the alexandrainst/ftspeech dataset for Danish automatic speech recognition. It achieves the following results on the evaluation set (test split):

Loss: 0.2781
Wer: 19.4638
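
For readers unfamiliar with the metric, here is a minimal sketch of word error rate (WER): the word-level edit distance between reference and hypothesis, divided by the number of reference words. It assumes the figure above is a percentage, as is usual for Trainer-generated Whisper cards. The function name and the Danish example sentence are illustrative and not taken from the evaluation code, and no text normalization is applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Illustrative sentence pair (assumption): one of seven reference words is dropped.
example_wer = word_error_rate("det er en god dag i dag", "det er en dag i dag")
print(f"{example_wer:.2f}")
```

A corpus-level score is normally obtained by summing errors and reference words over all utterances before dividing, not by averaging per-utterance values.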

Model description

More information needed

Intended uses & limitations

More information needed
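
For orientation, a minimal sketch of how a fine-tuned Whisper checkpoint like this one is typically loaded for Danish transcription with the transformers `pipeline` API. The repository id below is a hypothetical placeholder, since this card does not state where the model is published. The 30-second chunking and the explicit language and task settings are assumptions, not documented settings for this model.

```python
# Hypothetical placeholder: replace with the actual Hub repository id or local path.
MODEL_ID = "your-namespace/whisper-ftspeech-julie"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe a Danish audio file with a fine-tuned Whisper checkpoint."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # assumption: split long recordings into 30 s windows
    )
    # Force Danish transcription instead of relying on language detection.
    result = asr(audio_path, generate_kwargs={"language": "danish", "task": "transcribe"})
    return result["text"]
```

Calling `transcribe("sample.wav")` would download the checkpoint on first use. Decoding audio from a file path also requires ffmpeg to be available.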

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 200
training_steps: 5000
mixed_precision_training: Native AMP
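
To make the numbers above concrete, the sketch below reproduces two derived quantities: the effective batch size and the learning rate at a given step under a linear schedule with warmup. It assumes training ran on a single device (consistent with 8 × 2 = 16) and that the schedule decays linearly to zero at the final step, as the standard linear scheduler in transformers does. The function names are illustrative.

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_schedule_lr(
    step: int,
    base_lr: float = 1e-5,
    warmup_steps: int = 200,
    total_steps: int = 5000,
) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp from 0 up to base_lr over the warmup period.
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr at the end of warmup down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(8, 2))
for s in (0, 100, 200, 2600, 5000):
    print(s, linear_schedule_lr(s))
```

Under these assumptions the peak rate of 1e-05 is reached at step 200 and has fallen to half by step 2600.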

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
0.4214	0.0080	500	0.4317	26.8590
0.3568	0.0161	1000	0.3763	24.5151
0.3443	0.0241	1500	0.3443	23.0618
0.3218	0.0321	2000	0.3275	22.0048
0.2851	0.0402	2500	0.3139	21.2409
0.2638	0.0482	3000	0.3021	20.4187
0.2515	0.0562	3500	0.2943	20.2420
0.2692	0.0643	4000	0.2864	19.9020
0.2503	0.0723	4500	0.2806	19.6671
0.2396	0.0803	5000	0.2781	19.4638
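
The table can be read quantitatively with a short calculation. The sketch below estimates how many training examples were seen, the training-set size implied by the Epoch column, and the relative WER reduction between the first and last evaluation. It assumes a total batch size of 16 on a single device and that Epoch is the fraction of one pass over the training data. Because Epoch is rounded to four decimals, the dataset-size figure is only an estimate.

```python
# (step, epoch, WER) rows copied from the training results table.
log = [
    (500, 0.0080, 26.8590),
    (1000, 0.0161, 24.5151),
    (1500, 0.0241, 23.0618),
    (2000, 0.0321, 22.0048),
    (2500, 0.0402, 21.2409),
    (3000, 0.0482, 20.4187),
    (3500, 0.0562, 20.2420),
    (4000, 0.0643, 19.9020),
    (4500, 0.0723, 19.6671),
    (5000, 0.0803, 19.4638),
]

TOTAL_TRAIN_BATCH_SIZE = 16  # from the hyperparameter list

first_wer = log[0][2]
final_step, final_epoch, final_wer = log[-1]

# Examples processed by the end of training.
samples_seen = final_step * TOTAL_TRAIN_BATCH_SIZE

# Implied training-set size (estimate, limited by the rounding of the epoch value).
approx_train_examples = samples_seen / final_epoch

# Relative WER improvement between step 500 and step 5000, in percent.
relative_wer_reduction = 100.0 * (first_wer - final_wer) / first_wer

print(samples_seen)
print(round(approx_train_examples))
print(round(relative_wer_reduction, 2))
```

On this reading, training covered roughly 8% of one pass over the data, and validation loss and WER were still decreasing at the last checkpoint.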

Framework versions

Transformers 4.47.0
Pytorch 2.5.1
Datasets 3.1.0
Tokenizers 0.21.0