Vira21/Whisper-Base-KhmerV2


Vira21 committed on Nov 2, 2024

Commit c708134 (verified) · 1 Parent(s): 0da6bc7

Update README.md

Files changed (1)

README.md +25 -1

@@ -12,4 +12,28 @@ base_model:
 new_version: Vira21/Whisper-Base-KhmerV2
 pipeline_tag: automatic-speech-recognition
 library_name: transformers
----

+---
+# Whisper-Base-KhmerV2
+This model is a fine-tuned variant of [openai/whisper-base](https://huggingface.co/openai/whisper-base), adapted primarily for Khmer speech recognition. Fine-tuning aimed to improve transcription accuracy across multiple languages, including Khmer, with a focus on the nuances of non-English languages and dialects.
+Explore its capabilities in real-time transcription and multilingual support in the demo space: [Whisper-Base-Khmer Demo](https://huggingface.co/spaces/Vira21/Whisper-Base-Khmer).
+## Model Overview
+- **Base Model**: OpenAI Whisper (openai/whisper-base)
+- **Language**: Khmer
+- **Datasets Used**:
+  - Google FLEURS
+  - OpenSLR
+  - Khmer Kheng Info Speech (seanghay/khmer_kheng_info_speech)
+  - KM-Speech-Corpus (seanghay/km-speech-corpus)
+  - Khmer Grkpp Speech (seanghay/khmer_grkpp_speech)
+  - English Dialects (ylacombe/english_dialects)
+- **Metrics**:
+  - **WER (Word Error Rate)**: 0.4529
+  - **Training Loss**: 0.1012
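
The card reports a WER of 0.4529 but does not show how to run the model or how the score was computed. The following is a minimal sketch, not the author's evaluation code. `transcribe` loads the checkpoint through the `transformers` ASR pipeline (the audio path is a hypothetical placeholder), and `word_error_rate` computes the usual word-level edit distance divided by the reference length, assuming whitespace-separated words. Khmer is often written without spaces between words, so the segmentation behind the reported figure is unknown and this sketch may not reproduce it.

```python
def transcribe(audio_path, model_id="Vira21/Whisper-Base-KhmerV2"):
    """Transcribe one audio file with the fine-tuned checkpoint.

    Assumption: `transformers` (plus a backend such as PyTorch) is installed
    and the model can be downloaded. `audio_path` is a placeholder argument.
    """
    # Imported lazily so the WER helper below works without transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(audio_path)["text"]


def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / reference word count.

    Assumption: words are separated by whitespace in both strings.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous = current
    return previous[-1] / len(ref_words)
```

To score a clip, one could call `word_error_rate(reference_text, transcribe("sample.wav"))`, where `sample.wav` and `reference_text` stand in for your own audio and ground-truth transcript. On this scale, a value of 0.4529 would correspond to roughly 45 word errors per 100 reference words.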