---
license: other
license_name: test
license_link: LICENSE
language:
- en
- fr
- de
- es
- pt
metrics:
- accuracy
- cer
pipeline_tag: automatic-speech-recognition
---

# Model Card for Model ID

> **Preview release for FOSDEM 2025, based on the current training epochs (training is still ongoing).**

## Overview

This is a family of low-latency streaming models designed for use on edge devices.

**Goal**: Provide faster inference or higher-quality transcription than similarly sized Whisper and other models.

- **Languages**: English, French, German (Spanish and Portuguese planned for release by **Feb 14**).

## Demos

- [**Browser Demo (CPU)**](https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm) *(Runs entirely in the browser using CPU.)*
- [**Gradio / Python Demo**](https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Python)

## License

The license is still under consideration (likely Coqui). The model is intended to be **dual-licensed**:

- **Free for non-commercial use**.
- **Affordable license for commercial use**.

## Training

- Training is done with a modified k2/Icefall pipeline.
- Inference can be performed with the standard Sherpa project.

## Acknowledgements

Special thanks to the [Lhotse](https://github.com/lhotse-speech/lhotse), [Sherpa](https://github.com/k2-fsa/sherpa), [k2](https://github.com/k2-fsa/k2), and [Icefall](https://github.com/k2-fsa/icefall) teams for their support and tools.
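
## Metric Example: CER

The metadata above lists `cer` (character error rate) as an evaluation metric. As a minimal sketch of how that number is commonly computed, the snippet below divides the character-level Levenshtein distance between a reference transcript and a model hypothesis by the reference length. The function names `edit_distance` and `cer` are illustrative and not part of this project; the exact text normalization used for the reported scores is not stated here, so none is applied.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between ref and hyp."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    print(cer("guten morgen", "guten morgan"))
```

Lower is better, and the value can exceed 1.0 when the hypothesis is much longer than the reference. In practice, both strings are usually normalized (case, punctuation, whitespace) before scoring.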