This is a conversion of Finnish-NLP/whisper-large-finnish-v3 into faster-whisper format.
This is our improved Whisper V3 model, now finetuned from OpenAI Whisper Large V3.
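Because the weights are in faster-whisper (CTranslate2) format, they can be loaded with the `faster-whisper` Python package. The following is a minimal sketch, not an official usage recipe: the repository id `mpasila/faster-whisper-large-finnish-v3`, the helper names `join_segments` and `transcribe_finnish`, and the file name `puhe.wav` are assumptions made for illustration.

```python
def join_segments(texts):
    """Join segment texts into one string, trimming stray whitespace."""
    return " ".join(t.strip() for t in texts if t.strip())


def transcribe_finnish(audio_path,
                       model_id="mpasila/faster-whisper-large-finnish-v3",  # assumed repo id
                       device="auto",
                       compute_type="default"):
    """Transcribe a Finnish audio file and return the text as one string."""
    # Imported lazily so this module can be loaded without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_id, device=device, compute_type=compute_type)
    # transcribe() returns a lazy generator of segments plus an info object;
    # decoding only happens while the generator is consumed.
    segments, _info = model.transcribe(audio_path, language="fi", beam_size=5)
    return join_segments(segment.text for segment in segments)


if __name__ == "__main__":
    # "puhe.wav" is a placeholder path, not a file shipped with the model.
    print(transcribe_finnish("puhe.wav"))
```

Passing `language="fi"` skips language detection, which is a reasonable choice for a model finetuned on Finnish only. On a GPU, `compute_type="float16"` is a common setting, though the best value depends on the hardware.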
We improve on our previously finetuned Whisper V2 model (https://huggingface.co/Finnish-NLP/whisper-large-v2-finnish) as follows:
CV11 (Common Voice 11 test set) WER (word error rate): 10.42 --> 8.23
Fleurs (a speech recognition test set by Google) WER (word error rate): 10.20 --> 8.21
The new figures are the normalized WER after 32k finetuning steps.
The model was trained on an Nvidia RTX 4080 for 32k steps with batch size 8 and gradient accumulation 2.
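The training setup implies an effective batch size, which can be worked out as below. This is a sketch under two assumptions not stated in the card: that training used a single GPU, and that the 32k steps count optimizer updates rather than forward passes. The function name `effective_batch_size` is illustrative.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Number of samples contributing to each optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


# Values from the training description: batch size 8, gradient accumulation 2.
batch = effective_batch_size(per_device_batch=8, grad_accum_steps=2)

# Assumes each of the 32k steps is one optimizer update.
samples_seen = 32_000 * batch

print(batch)         # 16
print(samples_seen)  # 512000
```

Gradient accumulation trades wall-clock time for memory: two micro-batches of 8 are processed one after the other before the weights are updated, which helps a large model fit on a single consumer GPU.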
Original OpenAI Whisper Large V3
CV11
- WER: 14.81
- WER NORMALIZED: 10.82
- CER: 2.7
- CER NORMALIZED: 2.07
Fleurs
- WER: 12.04
- WER NORMALIZED: 9.63
- CER: 2.48
- CER NORMALIZED: 3.64
After finetuning with Finnish data, our V3 model got these scores on the test sets:
@14000 finetuning steps
CV11
- WER: 11.36
- WER NORMALIZED: 8.31
- CER: 1.93
- CER NORMALIZED: 1.48
Fleurs
- WER: 10.2
- WER NORMALIZED: 8.56
- CER: 2.26
- CER NORMALIZED: 3.54
@32000 finetuning steps
CV11
- WER: 11.47
- WER NORMALIZED: 8.23
- CER: 1.91
- CER NORMALIZED: 1.43
Fleurs
- WER: 10.1
- WER NORMALIZED: 8.21
- CER: 2.2
- CER NORMALIZED: 3.23
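For readers unfamiliar with the metrics above, WER is the word-level edit distance between reference and hypothesis divided by the number of reference words, and CER is the same at character level. The sketch below illustrates the idea. The card does not say which text normalizer produced the "NORMALIZED" figures, so the `normalize` function here (lowercasing and stripping ASCII punctuation) is only an assumed stand-in, and all function names are illustrative.

```python
import string


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, insert, delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def normalize(text):
    """Assumed normalizer: lowercase, drop ASCII punctuation, collapse spaces."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())


def wer(reference, hypothesis):
    """Word error rate in percent; the reference must be non-empty."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent; the reference must be non-empty."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


reference = "Hyvää huomenta, Suomi!"
hypothesis = "hyvää huomenta suomi"

raw_wer = wer(reference, hypothesis)
normalized_wer = wer(normalize(reference), normalize(hypothesis))
print(raw_wer, normalized_wer)
```

In this toy example every word differs only in casing or punctuation, so the raw WER is 100 while the normalized WER is 0. The same effect, on a smaller scale, is why the normalized scores in the lists above are lower than the raw ones. Corpus-level scores are usually computed by summing errors and reference lengths over all utterances rather than averaging per-utterance rates.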