Update README.md
README.md
CHANGED
@@ -17,7 +17,7 @@ model-index:
 metrics:
 - name: Test WER
 type: wer
-value: 44
+value: 23.44
 ---
 # Wav2Vec2-Large-XLSR-53-Moroccan-Darija
 
@@ -61,7 +61,21 @@ Here's the output: ڭالت ليا هاد السيد هادا ما كاينش ب
 
 ## Evaluation & Previous works
 
-
+====================================
+
+-v3 (fine-tuned on 10 hours of audio + changed hyperparameters + discovered a huge bug when using the letter ا)
+
+**Wer**: 23.44
+
+**Training Loss**: 15.96
+
+**Validation Loss**: 33.92
+
+The validation loss is still high also because the validation data contains words that have never been trained before. The solution is to add more data and more hours of training.
+
+Further training to decrease the training Loss makes this model overfit a little bit.
+
+====================================
 
 -v2 (fine-tuned on 9 hours of audio + replaced أ and ى and إ with ا as it creates a lot of problems + tried to standardize the Moroccan Darija)
 
@@ -77,7 +91,7 @@ The validation loss is still high also because the validation data contains word
 
 Further training to decrease the training Loss makes this model overfit a little bit.
 
-
+====================================
 
 -v1 (fine-tuned on 6 hours of audio)
 
@@ -87,7 +101,7 @@ Further training to decrease the training Loss makes this model overfit a little
 
 **Validation Loss**: 45.24
 
-
+====================================
 
 ## Future Work
 
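
For readers unfamiliar with the metric this commit updates, here is a minimal sketch of how a word error rate such as the `Test WER` value might be computed. It assumes the figure is a corpus-level percentage (total word edits divided by total reference words) and that words are split on whitespace. The function names `edit_distance` and `wer` are illustrative and are not taken from the README or from the evaluation script that produced the reported number.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Corpus-level WER in percent (assumed convention, see lead-in)."""
    errors = 0
    words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        errors += edit_distance(ref_words, hyp.split())
        words += len(ref_words)
    if words == 0:
        raise ValueError("references contain no words")
    return 100.0 * errors / words
```

Under this convention, one substituted word in a four-word reference gives `wer(["a b c d"], ["a x c d"])` a value of 25.0.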
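
The v2 entry in the diff mentions replacing أ, ى and إ with ا before training. A minimal sketch of that kind of transcript normalization is shown here, assuming only those three letters are mapped and nothing else is changed. The helper name `normalize_alef` is hypothetical, and the actual preprocessing used for the model may differ.

```python
# Assumption: only the three letters named in the v2 note are mapped to
# plain alef (U+0627). Code points are written as escapes so the mapping
# stays unambiguous regardless of editor or font.
ALEF_VARIANTS = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> alef
    "\u0625": "\u0627",  # alef with hamza below -> alef
    "\u0649": "\u0627",  # alef maksura -> alef
})


def normalize_alef(text):
    """Return text with the three variants replaced by plain alef."""
    return text.translate(ALEF_VARIANTS)
```

Applying the same mapping to both the training transcripts and the evaluation references keeps the two consistent when WER is computed.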