---
language: "ar"
pipeline_tag: automatic-speech-recognition
tags:
- CTC
- Attention
- pytorch
- Transformer
license: "cc-by-nc-4.0"
datasets:
- MGB-3
- egyptian-arabic-conversational-speech-corpus
metrics:
- wer
model-index:
- name: omarxadel/hubert-large-arabic-egyptian
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    metrics:
    - name: Test WER
      type: wer
      value: 25.9
    - name: Validation WER
      type: wer
      value: 23.5
---

# Arabic Hubert-Large with CTC, fine-tuned on MGB-3 and Egyptian Arabic Conversational Speech Corpus (No LM)

This model is a fine-tuned version of [Arabic Hubert-Large](https://huggingface.co/asafaya/hubert-large-arabic). We fine-tuned it on the MGB-3 and Egyptian Arabic Conversational Speech Corpus datasets, achieving a state-of-the-art result for Egyptian Arabic with a WER of `25.9%`.

The original model was pre-trained on 2,000 hours of Arabic speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz. See the original [paper](https://arxiv.org/abs/2106.07447) for more details on the model.
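
As a quick way to verify the 16 kHz requirement before transcription, the sketch below reads the sample rate from a WAV file header using only the Python standard library. It assumes the input is an uncompressed PCM WAV file; the helper names `wav_sample_rate` and `needs_resampling` are illustrative and are not part of this model or any library.

```python
import wave

# Sample rate the model expects, as stated in this model card.
EXPECTED_SAMPLE_RATE = 16_000  # Hz


def wav_sample_rate(path: str) -> int:
    """Return the sample rate (Hz) stored in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path: str, expected: int = EXPECTED_SAMPLE_RATE) -> bool:
    """Return True if the file's sample rate differs from the expected rate."""
    return wav_sample_rate(path) != expected
```

A file for which `needs_resampling` returns `True` would have to be converted to 16 kHz with an audio tool or library of your choice before being passed to the model.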

The model's WER on the validation and test sets is as follows:

| Valid WER (%) | Test WER (%) |
|:-------------:|:------------:|
| 23.55 | 25.59 |
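
For readers unfamiliar with the metric, the following is a minimal sketch of word error rate: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, expressed as a percentage. It is not the evaluation script used for the figures above, which this card does not specify. It assumes whitespace tokenization and no text normalization, and it scores a single utterance, whereas corpus-level WER is usually computed by summing errors and reference words over all utterances. The function name `word_error_rate` is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER (in percent) of a hypothesis against a reference."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,  # deletion of a reference word
                    curr[j - 1] + 1,  # insertion of a hypothesis word
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr

    return 100.0 * prev[-1] / len(ref_words)
```

For example, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words, giving `50.0`.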

# Acknowledgement

Model fine-tuning and data processing for this work were performed as part of a graduation project at the Faculty of Engineering, Alexandria University, CCE Program.