---
title: README
emoji: 🐨
colorFrom: blue
colorTo: yellow
sdk: static
pinned: false
---

# Japanese ASR

This repository contains all the models and datasets for training and evaluating Japanese ASR that were produced while developing the [kotoba-whisper models](https://huggingface.co/collections/kotoba-tech/kotoba-whisper-661d04846a2892cc27a23921).

The following table compares the CER of models distilled from [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) using different amounts of ReazonSpeech data. The model names follow the pattern `japanese-asr/distil-whisper-large-v3-ja-reazonspeech-{size of reazonspeech}`.

***CER (Normalized)***: Normalized CER is a commonly used metric for Japanese ASR in which punctuation and special characters are ignored.

| model | [CommonVoice 8.0](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) | [JSUT basic5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) | [ReazonSpeech Test](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
|:------|------:|------:|------:|
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all) | 9.20 | 8.40 | 11.63 |
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large) | 9.44 | 8.48 | 12.60 |
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium) | 10.89 | 11.25 | 16.37 |
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small) | 30.48 | 38.96 | 42.29 |
| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny) | 94.69 | 95.32 | 95.82 |
| [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 8.52 | 7.18 | 15.18 |
| [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) | 9.7 | 8.2 | 28.5 |
| [openai/whisper-large](https://huggingface.co/openai/whisper-large) | 10 | 8.9 | 34.4 |
| [openai/whisper-base](https://huggingface.co/openai/whisper-base) | 28.2 | 25 | 69.4 |
| [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 11.34 | 9.87 | 29.56 |
| [openai/whisper-small](https://huggingface.co/openai/whisper-small) | 15.26 | 14.22 | 34.29 |
| [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 46.86 | 35.69 | 96.69 |
| [reazon-research/reazonspeech-nemo-v2](https://huggingface.co/reazon-research/reazonspeech-nemo-v2) | 9.07 | 7.43 | 11.17 |
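
To make the normalized CER metric concrete, here is a minimal Python sketch of how such a score could be computed. It is not the evaluation script used for the table above: the function names (`normalize`, `edit_distance`, `normalized_cer`) are hypothetical, and the normalization rule (NFKC folding, then dropping every punctuation, symbol, separator, and control character) is an assumption about what "punctuation and special characters are ignored" means. The actual pipeline may normalize differently, so scores from this sketch are not guaranteed to match the table.

```python
import unicodedata


def normalize(text: str) -> str:
    """Assumed normalization: NFKC-fold, then drop punctuation (P*),
    symbols (S*), separators/whitespace (Z*), and control chars (C*)."""
    text = unicodedata.normalize("NFKC", text)
    return "".join(
        ch for ch in text
        if not unicodedata.category(ch).startswith(("P", "S", "Z", "C"))
    )


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference character
                curr[j - 1] + 1,         # insertion of a hypothesis character
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1]


def normalized_cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER in percent: total edits / total reference characters."""
    errors = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_n, hyp_n = normalize(ref), normalize(hyp)
        errors += edit_distance(ref_n, hyp_n)
        total += len(ref_n)
    if total == 0:
        raise ValueError("references contain no characters after normalization")
    return 100.0 * errors / total
```

With this sketch, `normalized_cer(["こんにちは、世界。"], ["こんにちは世界"])` yields `0.0`, because the two strings differ only in punctuation. The score is aggregated over the whole corpus rather than averaged per utterance, which is one common convention; per-utterance averaging would give different numbers.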