Update README.md
README.md
CHANGED
@@ -21,14 +21,11 @@ model-index:
 ---
 # Wav2Vec2-Large-XLSR-53-Moroccan-Darija
 
-**wav2vec2-large-xlsr-53**
+**wav2vec2-large-xlsr-53 new model**
 
-- Fine-tuned on 34 hours
-- Each hour of audio is pronounced by a different person.
-- Transcriptions are performed by a single individual.
+- Fine-tuned on 34 hours of labeled Darija Audios extracted from MDVC corpus. MDVC Corpus contains more than 1000 hours of Moroccan Darija "ary".
 - Fine-tuning is ongoing 24/7 to enhance accuracy.
-- We are consistently adding
-- Audio database is organized (by sex, age, region, ..)
+- We are consistently adding data to the model every day (We prefer not to add all MDVC Corpus at once as we are trying to standardize the way we write this language).
 
 <table><thead><tr><th><strong>Training Loss</strong></th> <th><strong>Validation</strong></th> <th><strong>Loss Wer</strong></th></tr></thead> <tbody><tr>
 <td>0.022800</td>