metadata
license: other
license_name: test
license_link: LICENSE
language:
- en
- fr
- de
- es
- pt
metrics:
- accuracy
- cer
pipeline_tag: automatic-speech-recognition
Model Card for Model ID
Preview release for FOSDEM 2025, based on the training epochs completed so far (training is still ongoing).
Overview
This is a family of low-latency streaming models designed for use on edge devices.
- Goal: Provide faster inference or higher transcription quality than similarly sized Whisper models and other comparable models.
- Languages: English, French, and German (Spanish and Portuguese planned for release by Feb 14).
Demos
- Browser Demo (CPU): runs entirely in the browser using the CPU.
- Gradio / Python Demo
License
The license is still under consideration (likely Coqui). The model is intended to be dual-licensed:
- Free for non-commercial use.
- Affordable license for commercial use.
Training
- Training is done with a modified k2/Icefall pipeline.
- Inference can be performed with the standard Sherpa project.
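
As a rough illustration of streaming inference, here is a minimal sketch that assumes the models are exported as a streaming transducer for the `sherpa-onnx` Python package. This card does not confirm the export format, and the file names (`tokens.txt`, `encoder.onnx`, `decoder.onnx`, `joiner.onnx`), the 16 kHz sample rate, and the 100 ms chunk size are placeholders, not the actual release layout. The helper names `iter_chunks` and `transcribe_streaming` are hypothetical.

```python
from typing import Iterator, Sequence


def iter_chunks(samples: Sequence[float], chunk_size: int) -> Iterator[Sequence[float]]:
    """Yield consecutive slices of at most `chunk_size` samples."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


def transcribe_streaming(samples, sample_rate: int, model_dir: str, chunk_ms: int = 100) -> str:
    """Feed audio to a sherpa-onnx streaming recognizer chunk by chunk."""
    # Imported lazily so the chunking helper works without sherpa-onnx installed.
    import numpy as np
    import sherpa_onnx

    # ASSUMPTION: transducer export with these placeholder file names.
    recognizer = sherpa_onnx.OnlineRecognizer.from_transducer(
        tokens=f"{model_dir}/tokens.txt",
        encoder=f"{model_dir}/encoder.onnx",
        decoder=f"{model_dir}/decoder.onnx",
        joiner=f"{model_dir}/joiner.onnx",
        num_threads=1,
        sample_rate=16000,  # ASSUMPTION: model expects 16 kHz input
        feature_dim=80,
        decoding_method="greedy_search",
    )
    stream = recognizer.create_stream()

    chunk_size = sample_rate * chunk_ms // 1000
    for chunk in iter_chunks(samples, chunk_size):
        # Simulates audio arriving in small pieces, as from a microphone.
        stream.accept_waveform(sample_rate, np.asarray(chunk, dtype=np.float32))
        while recognizer.is_ready(stream):
            recognizer.decode_stream(stream)

    # Signal end of audio and decode whatever is still buffered.
    stream.input_finished()
    while recognizer.is_ready(stream):
        recognizer.decode_stream(stream)
    return recognizer.get_result(stream)
```

Calling `transcribe_streaming(samples, 16000, "path/to/model")` with mono float samples in the range [-1, 1] would return the recognized text. In a live setting, the partial result can be read with `recognizer.get_result(stream)` after each chunk instead of only at the end.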
Acknowledgements
Special thanks to the Lhotse, Sherpa, k2, and Icefall teams for their support and tools.