transiteration
commited on
Commit
•
731be01
1
Parent(s):
5072015
Update README.md
Browse files
README.md
CHANGED
@@ -54,12 +54,9 @@ Then, this model gives you the spoken words in a text format for a given audio s
|
|
54 |
|
55 |
QuartzNet 15x5 [2] is a Jasper-like network that uses separable convolutions and larger filter sizes. It has comparable accuracy to Jasper while having much fewer parameters. This particular model has 15 blocks each repeated 5 times.
|
56 |
|
57 |
-
## Training
|
58 |
|
59 |
The model was finetuned to Kazakh speech based on the pre-trained English Model for over several epochs.
|
60 |
-
|
61 |
-
## Dataset
|
62 |
-
|
63 |
Kazakh Speech Corpus 2 (KSC2) [3] is the first industrial-scale open-source Kazakh speech corpus.
|
64 |
In total, KSC2 contains around 1.2k hours of high-quality transcribed data comprising over 600k utterances.
|
65 |
|
|
|
54 |
|
55 |
QuartzNet 15x5 [2] is a Jasper-like network that uses separable convolutions and larger filter sizes. It has comparable accuracy to Jasper while having much fewer parameters. This particular model has 15 blocks each repeated 5 times.
|
56 |
|
57 |
+
## Training and Dataset
|
58 |
|
59 |
The model was finetuned to Kazakh speech based on the pre-trained English Model for over several epochs.
|
|
|
|
|
|
|
60 |
Kazakh Speech Corpus 2 (KSC2) [3] is the first industrial-scale open-source Kazakh speech corpus.
|
61 |
In total, KSC2 contains around 1.2k hours of high-quality transcribed data comprising over 600k utterances.
|
62 |
|