---
library_name: peft
license: apache-2.0
base_model: openai/whisper-large-v2
tags:
- automatic-speech-recognition
- whisper
- asr
- songhoy
- hsn
- Mali
- MALIBA-AI
- lora
- fine-tuned
- code-switching
- african-language
language:
- hsn
- fr
language_bcp47:
- hsn-ML
- fr-ML
model-index:
- name: songhoy-asr-v1
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: songhoy-asr
      type: custom
      split: test
      args:
        language: hsn
    metrics:
    - name: WER
      type: wer
      value: 16.58
    - name: CER
      type: cer
      value: 4.63
pipeline_tag: automatic-speech-recognition
---
# Songhoy-ASR-v1: First Open-Source Speech Recognition Model for Songhoy
Songhoy-ASR-v1 is the **first open-source speech recognition model** for Songhoy, a language spoken by over 3 million people across Mali, Niger, and Burkina Faso. Developed as part of the MALIBA-AI initiative, the model reaches a 16.58% word error rate on our test set and makes speech technology available to Songhoy speakers for the first time.
## Model Overview
The model is built for Songhoy speech recognition, with particular strengths in:
- **Pure Songhoy recognition**: Accurate transcription of traditional and contemporary Songhoy speech
- **Code-switching handling**: Effectively manages the natural mixing of Songhoy with French
- **Dialect adaptation**: Works across regional variations of Songhoy
- **Noise resilience**: Maintains accuracy even with moderate background noise
## Performance Metrics
Songhoy-ASR-v1 achieves the following results on the test split of our dataset:
| Metric | Value |
|--------|-------|
| Word Error Rate (WER) | 16.58% |
| Character Error Rate (CER) | 4.63% |
To our knowledge, these are the best publicly available results for Songhoy speech recognition, and they are low enough to consider the model for production applications.
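For readers unfamiliar with the two metrics, here is a minimal sketch of how WER and CER are conventionally computed: the edit distance (substitutions, deletions, and insertions) between the reference and the hypothesis, divided by the reference length, counted over words for WER and over characters for CER. The function names and the placeholder strings are illustrative, and this is not necessarily the evaluation script or the text normalization used to produce the numbers above.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution (or match when cost is 0)
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; this sketch counts spaces as characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Placeholder strings, not real Songhoy transcripts.
print(wer("a b c d", "a x c"))  # one substitution + one deletion over 4 words
print(cer("abcd", "abxd"))      # one substitution over 4 characters
```

A WER of 16.58% therefore means roughly one word-level error for every six reference words, while the lower CER suggests that many of those word errors differ from the reference by only a few characters.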
## Technical Details
The model is a fine-tuned version of OpenAI's Whisper-large-v2, adapted specifically for Songhoy using LoRA (Low-Rank Adaptation). LoRA trains a small set of low-rank adapter weights while the base model's weights stay frozen, which keeps fine-tuning efficient and helps retain the multilingual capabilities of the base model.
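To give a sense of why LoRA is parameter-efficient, the sketch below compares the trainable parameters of one full weight matrix with those of its low-rank update `B @ A`. The hidden size 1280 is the model dimension of Whisper-large; the rank `RANK = 32` is a hypothetical value chosen for illustration, since this card does not state the LoRA rank or the target modules.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters in a dense d_out x d_in weight matrix."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the LoRA factors: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


D_MODEL = 1280  # hidden size of Whisper-large
RANK = 32       # hypothetical rank, not taken from this model card

full = full_params(D_MODEL, D_MODEL)
lora = lora_params(D_MODEL, D_MODEL, RANK)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.1%}")
# prints: full: 1,638,400  lora: 81,920  ratio: 5.0%
```

Under these assumptions, each adapted projection trains about 5% of the parameters that full fine-tuning of the same matrix would update.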
### Training Information
- **Base Model**: openai/whisper-large-v2
- **Fine-tuning Method**: LoRA (Parameter-Efficient Fine-Tuning)
- **Training Dataset**: [coming soon]
- **Training Duration**: 4 epochs
- **Batch Size**: 32 (8 per device with gradient accumulation steps of 4)
- **Learning Rate**: 0.001 with linear scheduler and 50 warmup steps
- **Mixed Precision**: Native AMP
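
The batch-size and learning-rate settings above can be made concrete with a small sketch. It assumes a single device (so 8 × 4 = 32), a total of 976 optimizer steps (the last step logged in the Training Results table), and the usual linear-warmup-then-linear-decay rule. The name `lr_at` is illustrative, and the exact scheduler implementation used in training is not stated in this card.

```python
PEAK_LR = 1e-3
WARMUP_STEPS = 50
TOTAL_STEPS = 976      # assumption: final step logged in the training results
PER_DEVICE_BATCH = 8
GRAD_ACCUM_STEPS = 4
NUM_DEVICES = 1        # assumption: implied by 8 x 4 = 32


def effective_batch_size() -> int:
    """Samples contributing to each optimizer step."""
    return PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


print("effective batch size:", effective_batch_size())
for step in (0, 25, 50, 513, 976):
    print(f"step {step:>3}: lr = {lr_at(step):.6f}")
```

The learning rate climbs from 0 to 0.001 over the first 50 steps, then falls linearly, reaching half the peak around step 513 and zero at the final step.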
### Training Results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.3661 | 1.0 | 245 | 0.3118 |
| 0.2712 | 2.0 | 490 | 0.2215 |
| 0.2008 | 3.0 | 735 | 0.2011 |
| 0.1518 | 3.9857 | 976 | 0.1897 |
## Real-World Applications
Songhoy-ASR-v1 enables numerous applications previously unavailable to Songhoy speakers:
- **Media Transcription**: Automatic subtitling of Songhoy content
- **Voice Interfaces**: Voice-controlled applications in Songhoy
- **Educational Tools**: Language learning and literacy applications
- **Cultural Preservation**: Documentation of oral histories and traditions
- **Healthcare Communication**: Improved access to health information
- **Accessibility Solutions**: Tools for deaf and hard-of-hearing users
## Usage Examples
```
Coming soon
```
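Until official examples are published, the following is a minimal sketch of how a LoRA adapter for Whisper is typically loaded with PEFT and Transformers. It is not the maintainers' reference code. The adapter ID is an assumption taken from the citation URL on this card, the helper names `load_model` and `transcribe` are illustrative, and the sketch leaves Whisper's language and task settings at their defaults because the card does not say how they were configured during fine-tuning.

```python
BASE_MODEL_ID = "openai/whisper-large-v2"  # base model named in this card
ADAPTER_ID = "MALIBA-AI/songhoy-asr-v1"    # assumption: from the citation URL
SAMPLING_RATE = 16_000                     # Whisper expects 16 kHz mono audio


def load_model(device: str = "cpu"):
    """Load the base Whisper model, attach the LoRA adapter, and merge it."""
    # Local imports keep this module importable without the heavy dependencies.
    from peft import PeftModel
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(BASE_MODEL_ID)
    base = WhisperForConditionalGeneration.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model = model.merge_and_unload()  # fold the LoRA weights into the base model
    return processor, model.to(device).eval()


def transcribe(waveform, processor, model) -> str:
    """Transcribe a 1-D float waveform sampled at 16 kHz."""
    import torch

    inputs = processor(waveform, sampling_rate=SAMPLING_RATE, return_tensors="pt")
    features = inputs.input_features.to(model.device)
    with torch.no_grad():
        token_ids = model.generate(input_features=features)
    return processor.batch_decode(token_ids, skip_special_tokens=True)[0]
```

A typical call would load audio with a library such as `librosa` (for example `waveform, _ = librosa.load("clip.wav", sr=16000)`, where `clip.wav` is a placeholder path) and then run `processor, model = load_model()` followed by `transcribe(waveform, processor, model)`.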
## Limitations
[Coming Soon]
<!--
- Performance varies with different regional dialects of Songhoy
- Very specific technical terminology may have lower accuracy
- Extreme background noise can impact transcription quality
- Very young speakers or non-native speakers may have reduced accuracy
- Limited performance with extremely low-quality audio recordings -->
## Part of MALIBA-AI's African Language Initiative
Songhoy-ASR-v1 is part of MALIBA-AI's commitment to developing speech technology for all Malian languages. This model represents a significant step toward digital inclusion for Songhoy speakers and demonstrates the potential for high-quality AI systems for African languages.
Our mission of "No Malian Language Left Behind" drives us to develop technologies that:
- Preserve linguistic diversity
- Enable access to digital tools regardless of language
- Support local innovation and content creation
- Bridge the digital divide for all Malians
## Framework Versions
- PEFT 0.14.1.dev0
- Transformers 4.50.0.dev0
- PyTorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
## License
This model is released under the Apache 2.0 license.
## Citation
```bibtex
@misc{songhoy-asr-v1,
author = {MALIBA-AI},
title = {Songhoy-ASR-v1: Speech Recognition for Songhoy},
year = {2025},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/MALIBA-AI/songhoy-asr-v1}}
}
```
---
**MALIBA-AI: Empowering Mali's Future Through Community-Driven AI Innovation**
*"No Malian Language Left Behind"*