naver/multilingual-distilwhisper-28k

	---
	license: mit
	datasets:
	- mozilla-foundation/common_voice_13_0
	language:
	- ca
	- ta
	- th
	tags:
	- automatic-speech-recognition
	inference: false
	---

	## About

	Multilingual DistilWhisper improves ASR performance in its target languages (Catalan, Tamil, and Thai for this checkpoint) by adding lightweight conditional language-specific routing (CLSR) modules on top of whisper-small.
	These modules are trained with a combination of a cross-entropy (ASR) loss and a knowledge distillation loss, with whisper-large-v2 as the teacher.
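
To make the training objective concrete, here is a minimal sketch of a combined cross-entropy and distillation loss for one transcript. It is an illustration under stated assumptions, not the released training code: the distillation term is written as a KL divergence between teacher and student token distributions as a stand-in (this card does not say which divergence is used), and `kd_weight`, the function names, and the toy logits are all hypothetical. Plain Python is used so the arithmetic is visible; real training would use a tensor library.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def cross_entropy(student_logits, target_id):
    """Negative log-likelihood of the reference token under the student."""
    return -log_softmax(student_logits)[target_id]

def kl_divergence(teacher_logits, student_logits):
    """KL(teacher || student) over the vocabulary for one decoding step.

    Assumption: KL is a stand-in for the distillation term.
    """
    t_logp = log_softmax(teacher_logits)
    s_logp = log_softmax(student_logits)
    return sum(math.exp(t) * (t - s) for t, s in zip(t_logp, s_logp))

def distillation_loss(student_logits, teacher_logits, target_ids, kd_weight=1.0):
    """Mean over tokens of CE + kd_weight * KD (kd_weight is hypothetical)."""
    total = 0.0
    for s, t, y in zip(student_logits, teacher_logits, target_ids):
        total += cross_entropy(s, y) + kd_weight * kl_divergence(t, s)
    return total / len(target_ids)

# Toy example: 2 decoding steps, vocabulary of 3 tokens.
student = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]   # student (whisper-small + CLSR) logits
teacher = [[3.0, 0.0, -2.0], [0.0, 0.0, 4.0]]   # teacher (whisper-large-v2) logits
targets = [0, 2]                                 # reference token ids
loss = distillation_loss(student, teacher, targets, kd_weight=0.5)
```

Setting `kd_weight=0.0` reduces the objective to plain ASR cross-entropy, while larger values pull the student's token distribution closer to the teacher's.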

	## Inference

	A loader will be made available soon at https://github.com/naver

	## Citation
	```
	@inproceedings{ferraz2024distilwhisper,
	title={Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts},
	author={Ferraz, Thomas Palmeira and Boito, Marcely Zanon and Brun, Caroline and Nikoulina, Vassilina},
	booktitle={ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
	year={2024},
	organization={IEEE}
	}
	```