asahi417 committed · Commit 0bc616c · verified · 1 Parent(s): 1785d49

Update README.md

Files changed (1)
  1. README.md +20 -2
README.md CHANGED
@@ -9,7 +9,7 @@ pinned: false
  # Japanese ASR

  This repository contains all the models and datasets used to train and evaluate Japanese ASR that were generated in the process of building the [kotoba-whisper models](https://huggingface.co/collections/kotoba-tech/kotoba-whisper-661d04846a2892cc27a23921).
- Below is a CER (character error rate) comparison across the ReazonSpeech data sizes used to distill whisper-large-v3. Model names follow the pattern
  `japanese-asr/distil-whisper-large-v3-ja-reazonspeech-{size of reazonspeech}`.

  ### CER (Normalized)
@@ -20,12 +20,30 @@ Following table shows CER comparison with different data size of ReazonSpeech us
  | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium) | 10.89 | 11.25 | 16.37 |
  | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small) | 30.48 | 38.96 | 42.29 |
  | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny) | 94.69 | 95.32 | 95.82 |
- | [reazon-research/reazonspeech-nemo-v2](https://huggingface.co/reazon-research/reazonspeech-nemo-v2) | 9.07 | 7.43 | 11.17 |
  | [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 8.52 | 7.18 | 15.18 |
  | [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 11.34 | 9.87 | 29.56 |
  | [openai/whisper-small](https://huggingface.co/openai/whisper-small) | 15.26 | 14.22 | 34.29 |
  | [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 46.86 | 35.69 | 96.69 |

  ### CER (Raw)
 
  # Japanese ASR

  This repository contains all the models and datasets used to train and evaluate Japanese ASR that were generated in the process of building the [kotoba-whisper models](https://huggingface.co/collections/kotoba-tech/kotoba-whisper-661d04846a2892cc27a23921).
+ Below is a CER (character error rate) comparison across the ReazonSpeech data sizes used to distill [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3). Model names follow the pattern
  `japanese-asr/distil-whisper-large-v3-ja-reazonspeech-{size of reazonspeech}`.

  ### CER (Normalized)
 
  | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium) | 10.89 | 11.25 | 16.37 |
  | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small) | 30.48 | 38.96 | 42.29 |
  | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny) | 94.69 | 95.32 | 95.82 |
  | [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 8.52 | 7.18 | 15.18 |
+ | [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) | 9.7 | 8.2 | 28.5 |
+ | [openai/whisper-large](https://huggingface.co/openai/whisper-large) | 10 | 8.9 | 34.4 |
+ | [openai/whisper-base](https://huggingface.co/openai/whisper-base) | 28.2 | 25 | 69.4 |
  | [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 11.34 | 9.87 | 29.56 |
  | [openai/whisper-small](https://huggingface.co/openai/whisper-small) | 15.26 | 14.22 | 34.29 |
  | [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 46.86 | 35.69 | 96.69 |
+ | [reazon-research/reazonspeech-nemo-v2](https://huggingface.co/reazon-research/reazonspeech-nemo-v2) | 9.07 | 7.43 | 11.17 |

  ### CER (Raw)
+ | model | [CommonVoice 8.0](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) | [JSUT basic5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) | [ReazonSpeech Test](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
+ |:------------------------------------------------------------|---------------------------------------:|-------------------------------------:|----------------------------------------:|
+ | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-all) | 15.4 | 15.4 | 17.4 |
+ | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-large) | 15.5 | 15.2 | 17.8 |
+ | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-medium) | 17 | 18.4 | 20.2 |
+ | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-small) | 34.4 | 43.2 | 44.2 |
+ | [japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny](https://huggingface.co/japanese-asr/distil-whisper-large-v3-ja-reazonspeech-tiny) | 93.7 | 95.1 | 95.6 |
+ | [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 12.9 | 13.4 | 20.6 |
+ | [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) | 13.5 | 10.6 | 34.4 |
+ | [openai/whisper-large](https://huggingface.co/openai/whisper-large) | 14 | 11.2 | 40.4 |
+ | [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 15.4 | 13 | 39 |
+ | [openai/whisper-base](https://huggingface.co/openai/whisper-base) | 31.6 | 26.4 | 74.5 |
+ | [openai/whisper-small](https://huggingface.co/openai/whisper-small) | 19.8 | 18.8 | 47 |
+ | [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 61.3 | 39.4 | 156.5 |
+ | [reazon-research/reazonspeech-nemo-v2](https://huggingface.co/reazon-research/reazonspeech-nemo-v2) | 12.6 | 10.6 | 15.4 |
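
To make the "Raw" and "Normalized" columns concrete, here is a minimal sketch of how a character error rate could be computed. It is not the evaluation code behind these tables: `edit_distance`, `normalize`, and `cer` are hypothetical helper names, and the normalization shown (Unicode NFKC folding plus removal of punctuation and whitespace) is an assumption standing in for whatever the authors actually applied.

```python
import unicodedata


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference character
                curr[j - 1] + 1,         # insertion of a hypothesis character
                prev[j - 1] + (r != h),  # substitution (free on a match)
            ))
        prev = curr
    return prev[-1]


def normalize(text: str) -> str:
    """Assumed normalization: NFKC folding, then drop punctuation and whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return "".join(
        ch for ch in text
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


def cer(ref: str, hyp: str, normalized: bool = False) -> float:
    """Character error rate in percent: 100 * edit distance / reference length."""
    if normalized:
        ref, hyp = normalize(ref), normalize(hyp)
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)
```

Because the denominator is the reference length, a hypothesis with many inserted characters can score above 100, which is consistent with values such as 156.5 in the raw table. Under this assumed normalization, a transcript that differs from the reference only in punctuation scores 0 in the normalized setting while still being penalized in the raw one.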
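
The naming pattern `japanese-asr/distil-whisper-large-v3-ja-reazonspeech-{size of reazonspeech}` can be expanded programmatically. The sketch below assumes the size suffixes are the five that appear in the tables (`all`, `large`, `medium`, `small`, `tiny`) and that the checkpoints load through the standard `transformers` ASR pipeline; `model_id` and `load_asr` are hypothetical helper names, not part of the repository.

```python
# Size suffixes as they appear in the model names listed in the tables.
SIZES = ("all", "large", "medium", "small", "tiny")


def model_id(size: str) -> str:
    """Expand the README naming pattern for one ReazonSpeech subset size."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {SIZES}")
    return f"japanese-asr/distil-whisper-large-v3-ja-reazonspeech-{size}"


def load_asr(size: str):
    """Build an ASR pipeline for the given size (downloads the model weights)."""
    # Imported lazily so model_id() stays usable without transformers installed.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id(size))
```

With `transformers` installed, something like `load_asr("all")("sample.wav")["text"]` should return a transcription, where `sample.wav` is a placeholder for a local audio file.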