---
library_name: transformers
license: openrail
datasets:
- alexandrainst/coral
language:
- da
metrics:
- wer
- cer
base_model:
- openai/whisper-large-v3
pipeline_tag: automatic-speech-recognition
model-index:
- name: coral-1-whisper-large
results:
- task:
type: automatic-speech-recognition
name: Automatic Speech Recognition
dataset:
name: CoRal read-aloud
type: alexandrainst/coral
split: test
args: read_aloud
metrics:
- type: cer
value: 4.3% ± 0.2%
name: CER
- type: wer
value: 10.4% ± 0.3%
name: WER
---

# Whisper-Large v3 trained on CoRal release 1
This is a state-of-the-art Danish speech recognition model, trained by Alvenir.
## Evaluation Results
| Model | Number of parameters | CoRal CER | CoRal WER |
|---|---:|---:|---:|
| Alvenir/coral-1-whisper-large | 1540M | 4.3% ± 0.2% | 10.4% ± 0.3% |
| alexandrainst/roest-315m | 315M | 6.6% ± 0.2% | 17.0% ± 0.4% |
| mhenrichsen/hviske-v2 | 1540M | 4.7% ± 0.07% | 11.8% ± 0.3% |
| openai/whisper-large-v3 | 1540M | 11.4% ± 0.3% | 28.3% ± 0.6% |
Results for more models and more datasets can be found in the model card for Røst-315m.
## Model details
This is the Whisper Large v3 model fine-tuned on the first release of the CoRal data.
The model was trained for 30K steps using the configuration from the CoRal repository by running:
```bash
python src/scripts/finetune_asr_model.py model=whisper-large max_steps=30000 model.learning_rate=1e-5
```
## License
Note that the dataset used for training is released under a custom license, adapted from OpenRAIL-M, which allows commercial use with a few restrictions (the model may not be used for speech synthesis or biometric identification). See the license for details.
## Creators and Funders
The CoRal project is funded by the Danish Innovation Fund and consists of the following partners:
We would specifically like to thank Dan Saattrup Nielsen (Alexandra Institute) for, among other things, the work on the repository, and Simon Leminen Madsen (Alexandra Institute) for the modelling work.