	---
	base_model: openai/whisper-large-v3-turbo
	library_name: transformers
	license: apache-2.0
	pipeline_tag: automatic-speech-recognition
	tags:
	- audio
	- automatic-speech-recognition
	- whisper
	- hf-asr-leaderboard
	---

	# Model Card for Lite-Whisper large-v3-turbo

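	Lite-Whisper is a version of OpenAI Whisper compressed with LiteASR, which applies low-rank approximation to the model. This checkpoint is derived from [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo). See our [GitHub repository](https://github.com/efeslab/LiteASR) and [paper](https://arxiv.org/abs/2502.20583) for details.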

	## Benchmark Results

	The table below reports the average word error rate (WER, lower is better) on the [ESB datasets](https://huggingface.co/datasets/hf-audio/esb-datasets-test-only-sorted), together with the encoder and decoder parameter counts of each model:

	| Model | Average WER (↓) | Encoder Size | Decoder Size |
	|-------|----------------|--------------|--------------|
	| [whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 10.1 | 635M | 907M |
	| [lite-whisper-large-v3-acc](https://huggingface.co/efficient-speech/lite-whisper-large-v3-acc) | 10.1 | 429M | 907M |
	| [lite-whisper-large-v3](https://huggingface.co/efficient-speech/lite-whisper-large-v3) | 10.2 | 377M | 907M |
	| [lite-whisper-large-v3-fast](https://huggingface.co/efficient-speech/lite-whisper-large-v3-fast) | 11.3 | 308M | 907M |
	|   |   |   |   |
	| [whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo) | 10.1 | 635M | 172M |
	| [lite-whisper-large-v3-turbo-acc](https://huggingface.co/efficient-speech/lite-whisper-large-v3-turbo-acc) | 10.2 | 421M | 172M |
	| [lite-whisper-large-v3-turbo](https://huggingface.co/efficient-speech/lite-whisper-large-v3-turbo) | 12.6 | 374M | 172M |
	| [lite-whisper-large-v3-turbo-fast](https://huggingface.co/efficient-speech/lite-whisper-large-v3-turbo-fast) | 20.1 | 313M | 172M |
	|   |   |   |   |
	| [whisper-medium](https://huggingface.co/openai/whisper-medium) | 14.8 | 306M | 457M |
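
To make the size/accuracy trade-off in the table above easier to read, the following sketch computes, for each turbo variant, the encoder size reduction relative to `whisper-large-v3-turbo` and the corresponding increase in average WER. The numbers are copied from the table; the helper name `tradeoff` and the dictionary layout are illustrative assumptions, not part of LiteASR.

```python
# Values copied from the benchmark table above:
# average WER in percent, encoder size in millions of parameters.
BASELINE = {"name": "whisper-large-v3-turbo", "wer": 10.1, "encoder_m": 635}
VARIANTS = [
    {"name": "lite-whisper-large-v3-turbo-acc", "wer": 10.2, "encoder_m": 421},
    {"name": "lite-whisper-large-v3-turbo", "wer": 12.6, "encoder_m": 374},
    {"name": "lite-whisper-large-v3-turbo-fast", "wer": 20.1, "encoder_m": 313},
]


def tradeoff(baseline, variant):
    """Hypothetical helper: return (encoder reduction in %, WER increase in points)."""
    reduction = 100.0 * (1 - variant["encoder_m"] / baseline["encoder_m"])
    wer_delta = variant["wer"] - baseline["wer"]
    # Round to one decimal place, matching the precision of the table.
    return round(reduction, 1), round(wer_delta, 1)


for v in VARIANTS:
    reduction, wer_delta = tradeoff(BASELINE, v)
    print(f"{v['name']}: encoder -{reduction}%, WER +{wer_delta}")
```

Under these figures, the `-acc` variant removes about a third of the encoder parameters for a 0.1-point WER increase, while the `-fast` variant halves the encoder at a cost of 10 WER points.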

	## Citation

	If you use LiteASR in your research, please cite the following paper:

	```bibtex
	@misc{kamahori2025liteasrefficientautomaticspeech,
	      title={LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation},
	      author={Keisuke Kamahori and Jungo Kasai and Noriyuki Kojima and Baris Kasikci},
	      year={2025},
	      eprint={2502.20583},
	      archivePrefix={arXiv},
	      primaryClass={cs.LG},
	      url={https://arxiv.org/abs/2502.20583},
	}
	```