naver/multilingual-distilwhisper-3k


	---
	license: mit
	datasets:
	- mozilla-foundation/common_voice_13_0
	language:
	- ca
	- bg
	- cs
	- fi
	- gl
	- hi
	- hu
	- pl
	- ro
	- sk
	- ta
	- th
	---

	## About

	Multilingual DistilWhisper improves speech recognition quality in the target languages by adding lightweight conditional language-specific routing (CLSR) modules on top of whisper-small.
	These modules are trained with a combination of a cross-entropy (ASR) loss and a knowledge distillation loss, with whisper-large-v2 as the teacher.
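
As a rough illustration of how a cross-entropy loss and a knowledge distillation loss can be mixed, here is a minimal per-token sketch in plain Python. It is not the authors' training code. The KL-divergence distillation term, the additive weighting, and the names `kd_weight` and `temperature` are all assumptions made for this example, and the divergence and weighting used in the paper may differ.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_z for x in scaled]


def cross_entropy(student_logits, target_index):
    """ASR loss for one token: negative log-probability of the reference token."""
    return -log_softmax(student_logits)[target_index]


def kl_divergence(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) for one token (assumed distillation term)."""
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def combined_loss(student_logits, teacher_logits, target_index,
                  kd_weight=1.0, temperature=1.0):
    """Cross-entropy plus a weighted distillation term.

    kd_weight and temperature are hypothetical hyperparameters.
    """
    ce = cross_entropy(student_logits, target_index)
    kd = kl_divergence(teacher_logits, student_logits, temperature)
    return ce + kd_weight * kd


# Toy vocabulary of three tokens; the reference token is index 2.
student = [0.2, 0.1, 1.5]
teacher = [0.0, 0.0, 3.0]
loss = combined_loss(student, teacher, target_index=2, kd_weight=0.5)
```

In this sketch the student stands in for whisper-small with the added modules and the teacher for whisper-large-v2. When the two produce identical logits, the distillation term vanishes and only the cross-entropy term remains.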

	## Inference

	A loader will be made available soon at https://github.com/naver

	## Citation (submitted to ICASSP 2024)
	```
	@article{ferraz2023distilwhisper,
	title={DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts},
	author={Ferraz, Thomas Palmeira and Boito, Marcely Zanon and Brun, Caroline and Nikoulina, Vassilina},
	journal={arXiv preprint arXiv:2311.01070},
	year={2023}
	}
	```