Update README.md
README.md
CHANGED
@@ -12,4 +12,28 @@ base_model:
|
|
12 |
new_version: Vira21/Whisper-Base-KhmerV2
|
13 |
pipeline_tag: automatic-speech-recognition
|
14 |
library_name: transformers
|
15 |
-
---
|
+
---
|
16 |
+
|
17 |
+
# Whisper-Base-KhmerV2
|
18 |
+
|
19 |
+
|
20 |
+
This model is a fine-tuned variant of [openai/whisper-base](https://huggingface.co/openai/whisper-base), trained on a mix of Khmer and English speech datasets. It aims to improve transcription accuracy across multiple languages, including Khmer, with a focus on the nuances of non-English languages and dialects.
|
21 |
+
|
22 |
+
Try it in the demo Space, which showcases real-time transcription and multilingual support: [Whisper-Base-Khmer Demo](https://huggingface.co/spaces/Vira21/Whisper-Base-Khmer).
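
For local use, the model can probably be loaded through the `transformers` ASR pipeline, since the card declares `library_name: transformers` and `pipeline_tag: automatic-speech-recognition`. The sketch below is an assumption, not the author's documented usage: the model ID is taken from the `new_version` field, and the helper name `transcribe` and the audio path are hypothetical.

```python
# Minimal sketch, assuming the checkpoint loads with the standard
# transformers ASR pipeline. MODEL_ID comes from the card's `new_version`
# field; `transcribe` is a hypothetical helper, not part of any library.
MODEL_ID = "Vira21/Whisper-Base-KhmerV2"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcription of one audio file."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    # Downloads the checkpoint on first use; decoding a file path needs ffmpeg.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]


# Example call (the file name is a placeholder):
# print(transcribe("sample_khmer.wav"))
```

Building the pipeline inside the function keeps the sketch short; for repeated calls it would be cheaper to create the pipeline once and reuse it.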
|
23 |
+
|
24 |
+
|
25 |
+
|
26 |
+
## Model Overview
|
27 |
+
|
28 |
+
- **Base Model**: OpenAI Whisper (`openai/whisper-base`)
|
29 |
+
- **Language**: Khmer
|
30 |
+
- **Datasets Used**:
|
31 |
+
- Google Fleurs
|
32 |
+
- OpenSLR
|
33 |
+
- Khmer Kheng Info Speech (seanghay/khmer_kheng_info_speech)
|
34 |
+
- KM-Speech-Corpus (seanghay/km-speech-corpus)
|
35 |
+
- Khmer Grkpp Speech (seanghay/khmer_grkpp_speech)
|
36 |
+
- English Dialects (ylacombe/english_dialects)
|
37 |
+
- **Metrics**:
|
38 |
+
- **WER (Word Error Rate)**: 0.4529
|
39 |
+
- **Training Loss**: 0.1012
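
To make the WER figure easier to interpret: word error rate is the word-level edit distance between the reference and the predicted transcript, divided by the number of reference words, so 0.4529 means roughly 45 errors per 100 reference words. The card does not say how WER was computed or how Khmer text was segmented into words (Khmer is normally written without spaces between words), so the sketch below simply assumes whitespace-separated tokens. The function name `word_error_rate` is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumes both strings are already split into words by whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against four reference words.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.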
|