Update README.md
README.md
CHANGED
@@ -12,7 +12,8 @@ This repository contains all the models and datasets for train/evaluate the Japa
 Following table shows CER comparison with different data size of ReazonSpeech used to distill [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3). The model names follows
 `japanese-asr/distil-whisper-large-v3-ja-reazonspeech-{size of reazonspeech}`.
 
-***CER
+***CER***
+
 | model | [CommonVoice 8.0](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) | [JSUT basic5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) | [ReazonSpeech Test](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
 |:--------------------------------------------------------------------------------------------------------------------------------------------------|-------------------:|-----------------:|--------------------:|
 | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all) | 9.20 | 8.40 | 11.63 |
@@ -21,29 +22,12 @@ Following table shows CER comparison with different data size of ReazonSpeech us
 | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small) | 30.48 | 38.96 | 42.29 |
 | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny) | 94.69 | 95.32 | 95.82 |
 | [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 8.52 | 7.18 | 15.18 |
-| [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2)
-| [openai/whisper-large](https://huggingface.co/openai/whisper-large)
-| [openai/whisper-base](https://huggingface.co/openai/whisper-base)
+| [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) | 9.70 | 8.20 | 28.50 |
+| [openai/whisper-large](https://huggingface.co/openai/whisper-large) | 10.00 | 8.90 | 34.40 |
+| [openai/whisper-base](https://huggingface.co/openai/whisper-base) | 28.20 | 25.00 | 69.40 |
 | [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 11.34 | 9.87 | 29.56 |
 | [openai/whisper-small](https://huggingface.co/openai/whisper-small) | 15.26 | 14.22 | 34.29 |
 | [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 46.86 | 35.69 | 96.69 |
 | [reazon-research/reazonspeech-nemo-v2](https://huggingface.co/reazon-research/reazonspeech-nemo-v2) | 9.07 | 7.43 | 11.17 |
 
-
-| model | [CommonVoice 8.0](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) | [JSUT basic5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) | [ReazonSpeech Test](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
-|:------------------------------------------------------------|---------------------------------------:|-------------------------------------:|----------------------------------------:|
-| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all) | 15.4 | 15.4 | 17.4 |
-| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large) | 15.5 | 15.2 | 17.8 |
-| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium) | 17 | 18.4 | 20.2 |
-| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small) | 34.4 | 43.2 | 44.2 |
-| [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny) | 93.7 | 95.1 | 95.6 |
-| [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 12.9 | 13.4 | 20.6 |
-| [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) | 13.5 | 10.6 | 34.4 |
-| [openai/whisper-large](https://huggingface.co/openai/whisper-large) | 14 | 11.2 | 40.4 |
-| [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 15.4 | 13 | 39 |
-| [openai/whisper-base](https://huggingface.co/openai/whisper-base) | 31.6 | 26.4 | 74.5 |
-| [openai/whisper-small](https://huggingface.co/openai/whisper-small) | 19.8 | 18.8 | 47 |
-| [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 61.3 | 39.4 | 156.5 |
-| [reazon-research/reazonspeech-nemo-v2](https://huggingface.co/reazon-research/reazonspeech-nemo-v2) | 12.6 | 10.6 | 15.4 |
-
--->
+Please find more detailed results at [kotoba-whisper codebase](https://github.com/kotoba-tech/kotoba-whisper).
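
For readers unfamiliar with the metric in the table, character error rate (CER) is commonly defined as the character-level edit distance between hypothesis and reference, divided by the number of reference characters and reported as a percentage. The sketch below is only an illustration of that definition: the README does not state how its numbers were computed, so the corpus-level aggregation and the absence of any text normalization are assumptions, and `edit_distance` and `cer` are hypothetical helper names.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER in percent (assumed aggregation, no normalization)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return 100.0 * total_edits / total_chars


if __name__ == "__main__":
    # One substituted character out of five reference characters.
    print(cer(["こんにちは"], ["こんにちわ"]))
    # Heavy insertion: four extra characters against a two-character reference.
    print(cer(["はい"], ["はいはいはい"]))
```

Because inserted characters count as errors while the denominator is the reference length, CER can exceed 100, which is consistent with the 156.5 reported for `openai/whisper-tiny` in the removed table.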