Spaces: SalahZa/Tunisian-ASR-v0

anonymoussubmitter222 committed on Jan 15, 2023

Commit a03fd36 (1 parent: 1b934f7)

added description

Files changed (1)

app.py +35 -0

@@ -331,6 +331,41 @@ asr_brain = ASR(
     run_opts=run_opts,
     checkpointer=hparams["checkpointer"],
 )
+description = """
+# Global description
+This is a SpeechBrain-based Automatic Speech Recognition (ASR) model for Tunisian Arabic. It outputs Tunisian transcriptions written in Arabic. Since Tunisian Arabic has no standardized written form, the transcriptions may vary. This model is the work of Salah Zaiem, PhD candidate. Contact: [email protected]
+# Pipeline description
+This ASR system is composed of 2 different but linked blocks:
+- Acoustic model (wavlm-large + CTC). A pretrained wavlm-large model (https://huggingface.co/microsoft/wavlm-large) is combined with two DNN layers and fine-tuned on a Tunisian Arabic dataset.
+- KenLM-based 4-gram language model, trained on the training data.
+The final acoustic representation is passed to the CTC greedy decoder.
+The system is trained on single-channel recordings resampled to 16 kHz. (The model should also work well on audio upsampled from 8 kHz.)
+# Limitations
+Due to the nature of the available training data, the model may struggle with foreign words. Tunisian speakers commonly use foreign (mainly French) words, and these will lead to more errors. We are working on improving this in future models.
+Inference runs on CPU to keep this Space free, which leads to long running times on long sequences. If you need to transcribe long sequences for your project or research, feel free to send an email to: [email protected]
+# Referencing SpeechBrain
+This work has no published paper yet, and may never have one. If you use it in an academic setting, please cite the original SpeechBrain paper:
+```
+@misc{SB2021,
+    author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Ju-Chieh, Chou and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua },
+    title = {SpeechBrain},
+    year = {2021},
+    publisher = {GitHub},
+    journal = {GitHub repository},
+    howpublished = {\\url{https://github.com/speechbrain/speechbrain}},
+  }
+```
+"""
 asr_brain.device= "cpu"
 asr_brain.modules.to("cpu")
 asr_brain.tokenizer = label_encoder
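
The description added in this commit says the acoustic output is passed to a CTC greedy decoder. As a rough illustration of what that step does, here is a minimal pure-Python sketch; it is not the code used in `app.py`. The function name `ctc_greedy_decode`, the toy scores, and the choice of `blank_id=0` are all assumptions made for the example (the real blank index is whatever the Space's hyperparameters define).

```python
from typing import List, Optional, Sequence


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank_id: int = 0
) -> List[int]:
    """Greedy CTC decoding: pick the best label per frame,
    merge consecutive repeats, then drop blank labels.

    `blank_id=0` is an assumption for this sketch only.
    """
    decoded: List[int] = []
    previous: Optional[int] = None
    for scores in frame_scores:
        # Index of the highest-scoring label for this frame.
        best = max(range(len(scores)), key=lambda i: scores[i])
        # Emit only when the label changes and is not the blank symbol.
        if best != previous and best != blank_id:
            decoded.append(best)
        previous = best
    return decoded


# Toy per-frame scores over 3 labels (0 = blank, hypothetical values).
frames = [
    [0.10, 0.80, 0.10],  # label 1
    [0.20, 0.70, 0.10],  # label 1 again, merged with the previous frame
    [0.90, 0.05, 0.05],  # blank, separates the two runs of label 1
    [0.10, 0.80, 0.10],  # label 1, kept because a blank came before it
    [0.10, 0.20, 0.70],  # label 2
]
print(ctc_greedy_decode(frames))  # [1, 1, 2]
```

The resulting label indices would then be mapped back to characters by a tokenizer or label encoder. Note that a purely greedy decoder does not consult a language model, so how the KenLM 4-gram model mentioned in the description is applied is not shown here.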