---
library_name: transformers
license: openrail
datasets:
- alexandrainst/coral
language:
- da
metrics:
- wer
- cer
base_model:
- openai/whisper-large-v3
pipeline_tag: automatic-speech-recognition
model-index:
- name: coral-1-whisper-large
results:
- task:
type: automatic-speech-recognition
name: Automatic Speech Recognition
dataset:
name: CoRal read-aloud
type: alexandrainst/coral
split: test
args: read_aloud
metrics:
- type: cer
value: 4.3% ± 0.2%
name: CER
- type: wer
value: 10.4% ± 0.3%
name: WER
---

# Whisper-Large v3 trained on CoRal release 1
This is a state-of-the-art Danish speech recognition model, trained by Alvenir.
## Evaluation Results
| Model | Number of parameters | CoRal CER | CoRal WER |
|---|---:|---:|---:|
| Alvenir/coral-1-whisper-large | 1540M | 4.3% ± 0.2% | 10.4% ± 0.3% |
| alexandrainst/roest-315m | 315M | 6.6% ± 0.2% | 17.0% ± 0.4% |
| mhenrichsen/hviske-v2 | 1540M | 4.7% ± 0.07% | 11.8% ± 0.3% |
| openai/whisper-large-v3 | 1540M | 11.4% ± 0.3% | 28.3% ± 0.6% |
Results for more models and more datasets can be found in the model card for Røst-315m.
## Model details
This is the Whisper Large v3 model fine-tuned on the first release of the CoRal data.
The model was trained for 30K steps using the configuration from the CoRal repository by running:
```bash
python src/scripts/finetune_asr_model.py model=whisper-large max_steps=30000 model.learning_rate=1e-5
```
## License
Note that the dataset used for training is released under a custom license, adapted from OpenRAIL-M, which allows commercial use with a few restrictions (the model may not be used for speech synthesis or biometric identification). See the license for details.
## Creators and Funders
The CoRal project is funded by the Danish Innovation Fund and consists of the following partners:
We would specifically like to thank Dan Saattrup Nielsen (Alexandra Institute) for, among other things, the work on the repository, and Simon Leminen Madsen (Alexandra Institute) for the modelling work.