---
license: mit
datasets:
- mozilla-foundation/common_voice_13_0
language:
- ca
- bg
- cs
- fi
- gl
- hi
- hu
- pl
- ro
- sk
- ta
- th
---

## About
Multilingual DistilWhisper improves ASR performance in target languages by adding lightweight CLSR (conditional language-specific routing) modules on top of whisper-small. These modules are trained on a mix of cross-entropy (ASR) and knowledge-distillation losses, with whisper-large-v2 serving as the teacher.
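For intuition, below is a minimal PyTorch sketch of the two ideas described above: a per-token gate that mixes a lightweight language-specific expert with the frozen shared path, and a training objective combining cross-entropy with knowledge distillation. All names and hyperparameters here (`CLSRLayer`, `d_expert`, `alpha`, `T`) are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CLSRLayer(nn.Module):
    """Sketch of a CLSR block: a scalar gate routes each token between a
    frozen shared feed-forward output and a small language-specific expert."""
    def __init__(self, d_model: int, d_expert: int):
        super().__init__()
        # Lightweight bottleneck expert, trained per target language.
        self.expert = nn.Sequential(
            nn.Linear(d_model, d_expert), nn.GELU(), nn.Linear(d_expert, d_model)
        )
        # Per-token scalar gate in [0, 1].
        self.gate = nn.Linear(d_model, 1)

    def forward(self, x: torch.Tensor, shared_out: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(x))  # (batch, time, 1)
        return g * self.expert(x) + (1.0 - g) * shared_out


def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5, T=1.0):
    """Mix of cross-entropy on the ASR labels and KL distillation
    against the whisper-large-v2 teacher's soft targets."""
    ce = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * ce + (1.0 - alpha) * kd
```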
## Inference
A loader will be made available soon at https://github.com/naver.
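Until then, the base whisper-small checkpoint can be run with the standard transformers API, as sketched below. Note that this is only a usage sketch of the underlying architecture: stock transformers will not load the CLSR modules from this repository, and `audio_array` is an assumed 16 kHz mono waveform.

```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("openai/whisper-small")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# `audio_array`: a 16 kHz mono waveform, e.g. from a Common Voice sample.
inputs = processor(audio_array, sampling_rate=16_000, return_tensors="pt")

# Transcribe in one of the supported target languages (here: Catalan).
predicted_ids = model.generate(
    inputs.input_features, language="ca", task="transcribe"
)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```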
## Citation (submitted to ICASSP 2024)
```bibtex
@article{ferraz2023distilwhisper,
  title={DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts},
  author={Ferraz, Thomas Palmeira and Boito, Marcely Zanon and Brun, Caroline and Nikoulina, Vassilina},
  journal={arXiv preprint arXiv:2311.01070},
  year={2023}
}
```