transiteration/stt_kz_quartznet15x5

Automatic Speech Recognition


transiteration committed on Sep 6, 2023

Commit 731be01 (1 parent: 5072015)

Update README.md

Files changed (1)

README.md +1 -4


@@ -54,12 +54,9 @@ Then, this model gives you the spoken words in a text format for a given audio s
 QuartzNet 15x5 [2] is a Jasper-like network that uses separable convolutions and larger filter sizes. It has comparable accuracy to Jasper while having much fewer parameters. This particular model has 15 blocks each repeated 5 times.
-## Training
+## Training and Dataset
 The model was finetuned to Kazakh speech based on the pre-trained English Model for over several epochs.
-## Dataset
 Kazakh Speech Corpus 2 (KSC2) [3] is the first industrial-scale open-source Kazakh speech corpus.
 In total, KSC2 contains around 1.2k hours of high-quality transcribed data comprising over 600k utterances.
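
The README text in this diff says QuartzNet reaches Jasper-like accuracy with far fewer parameters by using separable convolutions. As a rough illustration of where that saving comes from, the sketch below counts the weights of one regular 1D convolution and of one depthwise-separable 1D convolution. The kernel size and channel counts are illustrative assumptions chosen for the arithmetic, not values taken from this repository, and biases and normalization layers are ignored.

```python
def standard_conv1d_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weight count of a regular 1D convolution (bias ignored)."""
    return kernel_size * in_channels * out_channels


def separable_conv1d_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weight count of a depthwise-separable 1D convolution (bias ignored).

    The depthwise step applies one length-`kernel_size` filter per input
    channel; the pointwise step is a 1x1 convolution that mixes channels.
    """
    depthwise = kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


# Illustrative layer shape (an assumption, not read from the model config).
KERNEL_SIZE, IN_CHANNELS, OUT_CHANNELS = 33, 256, 256

standard = standard_conv1d_params(KERNEL_SIZE, IN_CHANNELS, OUT_CHANNELS)
separable = separable_conv1d_params(KERNEL_SIZE, IN_CHANNELS, OUT_CHANNELS)

print(f"standard:  {standard:,} weights")
print(f"separable: {separable:,} weights")
print(f"reduction: {standard / separable:.1f}x")
```

For this layer shape the separable form needs roughly 29 times fewer weights, which is why larger filter sizes stay affordable in such an architecture.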