Update README.md
README.md
@@ -89,24 +89,34 @@ from ReazonSpeech, and achieves competitive CER and WER on the out-of-domain tes
|
|
89 |
the Japanese subset from [CommonVoice 8.0](https://huggingface.co/datasets/common_voice) (see [Evaluation](#evaluation) for detail).
|
90 |
|
91 |
- ***CER***
|
92 |
|
93 |
-
| Model | CommonVoice 8.0 (Japanese) | JSUT Basic 5000 | ReazonSpeech Test |
|
94 |
-
|:------------------------------------------------------------------------------------------------|---------------------------:|----------------:|------------------:|
|
95 |
-
| [**kotoba-tech/kotoba-whisper-v1.0**](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0) | 9.44 | 8.48 | **12.60** |
|
96 |
-
| [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | **8.52** | **7.18** | 15.18 |
|
97 |
-
| [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 11.34 | 9.87 | 29.56 |
|
98 |
-
| [openai/whisper-small](https://huggingface.co/openai/whisper-small) | 15.26 | 14.22 | 34.29 |
|
99 |
-
| [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 46.86 | 35.69 | 96.69 |
|
100 |
|
101 |
- ***WER***
|
102 |
|
103 |
-
|
|
104 |
-
|
105 |
-
| [
|
106 |
-
| [
|
107 |
-
| [openai/whisper-
|
108 |
-
| [openai/whisper-
|
109 |
-
| [openai/whisper-
|
110 |
|
111 |
- ***Latency***: As kotoba-whisper uses the same architecture as [distil-whisper/distil-large-v3](https://huggingface.co/distil-whisper/distil-large-v3),
|
112 |
it inherits the benefit of the improved latency compared to [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)
|
|
|
89 |
the Japanese subset from [CommonVoice 8.0](https://huggingface.co/datasets/common_voice) (see [Evaluation](#evaluation) for detail).
|
90 |
|
91 |
- ***CER***
|
92 |
+
| model | [CommonVoice 8 (Japanese test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) | [JSUT Basic 5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) | [ReazonSpeech (held out test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
|
93 |
+
|:--------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------:|----------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------------:|
|
94 |
+
| [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0) | 9.2 | 8.4 | 11.6 |
|
95 |
+
| [kotoba-tech/kotoba-whisper-v1.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0) | 9.4 | 8.5 | 12.2 |
|
96 |
+
| [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 8.5 | 7.1 | 14.9 |
|
97 |
+
| [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) | 9.7 | 8.2 | 28.1 |
|
98 |
+
| [openai/whisper-large](https://huggingface.co/openai/whisper-large) | 10 | 8.9 | 34.1 |
|
99 |
+
| [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 11.5 | 10 | 33.2 |
|
100 |
+
| [openai/whisper-base](https://huggingface.co/openai/whisper-base) | 28.6 | 24.9 | 70.4 |
|
101 |
+
| [openai/whisper-small](https://huggingface.co/openai/whisper-small) | 15.1 | 14.2 | 41.5 |
|
102 |
+
| [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 53.7 | 36.5 | 137.9 |
|
103 |
+
|
104 |
|
105 |
|
106 |
- ***WER***
|
107 |
|
108 |
+
| model | [CommonVoice 8 (Japanese test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) | [JSUT Basic 5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) | [ReazonSpeech (held out test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
|
109 |
+
|:--------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------:|----------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------------:|
|
110 |
+
| [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0) | 58.8 | 63.7 | 55.6 |
|
111 |
+
| [kotoba-tech/kotoba-whisper-v1.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0) | 59.2 | 64.3 | 56.4 |
|
112 |
+
| [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 55.1 | 59.2 | 60.2 |
|
113 |
+
| [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) | 59.3 | 63.2 | 74.1 |
|
114 |
+
| [openai/whisper-large](https://huggingface.co/openai/whisper-large) | 61.1 | 66.4 | 74.9 |
|
115 |
+
| [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 63.4 | 69.5 | 76 |
|
116 |
+
| [openai/whisper-base](https://huggingface.co/openai/whisper-base) | 87.2 | 93 | 91.8 |
|
117 |
+
| [openai/whisper-small](https://huggingface.co/openai/whisper-small) | 74.2 | 81.9 | 83 |
|
118 |
+
| [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 93.8 | 97.6 | 94.9 |
|
119 |
+
|
120 |
|
121 |
- ***Latency***: As kotoba-whisper uses the same architecture as [distil-whisper/distil-large-v3](https://huggingface.co/distil-whisper/distil-large-v3),
|
122 |
it inherits the benefit of the improved latency compared to [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)
|
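The CER and WER columns in the tables above are edit-distance error rates. The evaluation script behind these numbers is not part of this diff, so the following is only a minimal sketch of the standard definition (Levenshtein distance divided by reference length). The helper names (`edit_distance`, `error_rate`, `cer`, `wer`) are hypothetical, no text normalization is applied, and WER here assumes whitespace-delimited words; Japanese text would first need a word segmenter, which is not shown. Because insertions count as errors, a rate can exceed 100, which is consistent with values such as the 137.9 CER listed for openai/whisper-tiny.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of units."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference unit missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra unit in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance divided by reference length, as a percentage."""
    if not ref_units:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_units, hyp_units) / len(ref_units)


def cer(reference, hypothesis):
    """Character error rate: units are individual characters."""
    return error_rate(list(reference), list(hypothesis))


def wer(reference, hypothesis):
    """Word error rate: units are whitespace-delimited words (an assumption)."""
    return error_rate(reference.split(), hypothesis.split())


if __name__ == "__main__":
    print(cer("こんにちは", "こんばんは"))  # two substituted characters out of five
    print(cer("ab", "abcde"))              # three insertions against a two-character reference
```

Dividing by the reference length rather than the hypothesis length is what allows a model that emits long spurious output to score above 100.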