---
license: mit
language:
- de
metrics:
- bleu
- wer
base_model:
- openai/whisper-large-v3-turbo
pipeline_tag: automatic-speech-recognition
library_name: transformers
---

# SCRUBBED REPOSITORY
# MODEL TAKEN DOWN

Because of the licenses of some of the datasets, the model had to be taken down.

# Whisper Large V3 Turbo (Swiss German Fine-Tuned with QLoRA)

This repository contains a fine-tuned version of OpenAI's Whisper Large V3 Turbo model, adapted to Swiss German dialects with QLoRA. On the combined evaluation data reported below, it outperforms both the base Turbo V3 model and Whisper Large V3 on Swiss German automatic speech recognition (ASR).

## Model Summary

- **Base Model**: Whisper Large V3 Turbo
- **Fine-Tuning Method**: QLoRA (8-bit precision)
- **Rank**: 200
- **Alpha**: 16
- **Hardware**: 2x NVIDIA A100 80GB GPUs
- **Training Time**: 140 hours
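
To give a sense of what rank 200 and alpha 16 mean in practice, the sketch below works through standard LoRA arithmetic in plain Python. It assumes the conventional scaling factor `alpha / r` and a single adapted 1280 x 1280 projection matrix (1280 is the hidden size of the Whisper large family). Which layers were actually adapted is not stated in this card, so the matrix shape and the helper names are illustrative only.

```python
# Standard LoRA replaces the update to a frozen weight W (d_out x d_in)
# with a low-rank product B @ A, scaled by alpha / r.
RANK = 200      # from the model summary
ALPHA = 16      # from the model summary
D_MODEL = 1280  # hidden size of Whisper large models (assumed target shape)


def lora_scaling(alpha: float, rank: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters of one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


scaling = lora_scaling(ALPHA, RANK)
adapter_params = lora_param_count(D_MODEL, D_MODEL, RANK)
full_params = D_MODEL * D_MODEL

print(f"scaling factor: {scaling}")
print(f"adapter parameters per 1280x1280 matrix: {adapter_params:,}")
print(f"fraction of the full matrix: {adapter_params / full_params:.4f}")
```

Under these assumptions the low-rank update is scaled by 0.08, and a rank-200 adapter on a square 1280-wide matrix trains 31.25% as many parameters as the matrix itself.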

## Performance Metrics

- **Word Error Rate (WER)**: **17.5%**
- **BLEU Score**: **65.0**

The model was evaluated on multiple datasets that cover a range of Swiss German dialects and speaker demographics.

### Dataset Summary

The model has been trained and evaluated on a comprehensive suite of Swiss German datasets:

1. **SDS-200 Corpus**
   - **Size**: 200 hours
   - **Description**: A corpus covering all Swiss German dialects.

2. **STT4SG-350**
   - **Size**: 343 hours
   - **Description**: Balanced distribution across Swiss German dialects and demographics, including gender representation.
   - **[Dataset Link](https://swissnlp.org/home/activities/datasets/)**

3. **SwissDial-Zh v1.1**
   - **Size**: 24 hours
   - **Description**: A dataset with balanced representation of Swiss German dialects.
   - **[Dataset Link](https://mtc.ethz.ch/publications/open-source/swiss-dial.html)**

4. **Swiss Parliament Corpus V2 (SPC)**
   - **Size**: 293 hours
   - **Description**: Parliament recordings across Swiss German dialects.
   - **[Dataset Link](https://www.cs.technik.fhnw.ch/i4ds-datasets)**

5. **ASGDTS (All Swiss German Dialects Test Set)**
   - **Size**: 13 hours
   - **Description**: A stratified dataset closely resembling real-world Swiss German dialect distribution.
   - **[Dataset Link](https://www.cs.technik.fhnw.ch/i4ds-datasets)**

## Results Across Datasets

### WER Scores

| **Model**                 | **WER (All)**  | **WER SD (All)**   |
|---------------------------|----------------|--------------------|
| Turbo V3 Swiss German     | **0.1672**     | **0.1754**         |
| Large V3                  | 0.2884         | 0.2829             |
| Turbo V3                  | 0.4392         | 0.2777             |
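
The table reports a WER and a standard deviation (SD) over all samples. As an illustration of how such figures are commonly obtained, here is a minimal sketch in plain Python that computes word-level WER per utterance via edit distance and then aggregates a mean and an SD. The example sentences, the whitespace tokenization, and the choice of per-utterance averaging with a population SD are assumptions, not a description of the evaluation code used for this model.

```python
from statistics import mean, pstdev


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical (reference, hypothesis) pairs, for illustration only.
pairs = [
    ("ich gehe heute nach hause", "ich gehe heute nach hause"),
    ("das wetter ist schön", "das wetter war schön"),
    ("wir sehen uns morgen", "wir sehen morgen"),
]
scores = [wer(ref, hyp) for ref, hyp in pairs]
print(f"mean WER: {mean(scores):.4f}, SD: {pstdev(scores):.4f}")
```

Because WER is normalized by the reference length only, a hypothesis with many insertions can score above 1.0, which is one reason per-utterance SDs can be large relative to the mean.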

### BLEU Scores

| **Model**                 | **BLEU (All)** | **BLEU SD (All)**  |
|---------------------------|----------------|--------------------|
| Turbo V3 Swiss German     | **0.65**       | **0.3149**         |
| Large V3                  | 0.5345         | 0.3453             |
| Turbo V3                  | 0.3367         | 0.2975             |
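
To put the WER and BLEU tables in relative terms, the following sketch computes the relative WER reduction and the relative BLEU gain of the fine-tuned model against each baseline. The numbers are copied from the two tables; the helper names are made up for this illustration.

```python
# Scores copied from the WER and BLEU tables (fractions, not percentages).
WER = {"Turbo V3 Swiss German": 0.1672, "Large V3": 0.2884, "Turbo V3": 0.4392}
BLEU = {"Turbo V3 Swiss German": 0.65, "Large V3": 0.5345, "Turbo V3": 0.3367}
FINE_TUNED = "Turbo V3 Swiss German"


def relative_wer_reduction(baseline: float, model: float) -> float:
    """Fraction of the baseline's errors removed (lower WER is better)."""
    return (baseline - model) / baseline


def relative_bleu_gain(baseline: float, model: float) -> float:
    """Fractional BLEU increase over the baseline (higher BLEU is better)."""
    return (model - baseline) / baseline


for name in ("Large V3", "Turbo V3"):
    wer_red = relative_wer_reduction(WER[name], WER[FINE_TUNED])
    bleu_gain = relative_bleu_gain(BLEU[name], BLEU[FINE_TUNED])
    print(f"vs {name}: WER -{wer_red:.1%}, BLEU +{bleu_gain:.1%}")
```

By these figures, the fine-tuned model removes roughly 62% of the base Turbo V3 model's word errors and roughly 42% of Large V3's.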

## Visual Results

### WER and BLEU Scores Across Datasets

![WER and BLEU Scores Across Datasets]()

### WER Scores Across Datasets

![WER Scores Across Datasets]()

### BLEU Scores Across Datasets

![BLEU Scores Across Datasets]()

## Usage

This model can be used directly with the Hugging Face Transformers library for Swiss German ASR.
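
A minimal sketch of such usage with the Transformers `pipeline` API follows. The repository ID is taken from the citation URL and may no longer resolve now that the model has been taken down. The helper names and the decoding options (30-second chunking, forcing German transcription) are assumptions for illustration, not settings documented for this model.

```python
MODEL_ID = "nizarmichaud/whisper-large-v3-turbo-swissgerman"  # from the citation URL


def build_generate_kwargs(language: str = "german") -> dict:
    """Decoding options passed to Whisper's generate(); values are assumptions."""
    return {"language": language, "task": "transcribe"}


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so this module can be loaded without Transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # split long recordings into 30-second windows
    )
    result = asr(audio_path, generate_kwargs=build_generate_kwargs())
    return result["text"]
```

Calling `transcribe("sample.wav")`, where `sample.wav` is a hypothetical local recording, would download the checkpoint on first use and return the transcript as a string. Passing `model_id="openai/whisper-large-v3-turbo"` runs the same code against the base model instead.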

## Acknowledgments

Special thanks to the creators and maintainers of the datasets used in this work:
- [Swiss NLP](https://swissnlp.org/home/activities/datasets/)
- [ETH Zurich](https://mtc.ethz.ch/publications/open-source/swiss-dial.html)
- [FHNW](https://www.cs.technik.fhnw.ch/i4ds-datasets)

Thanks also to the [University of Geneva](https://unige.ch) for granting access to its High Performance Computing cluster, on which the model was trained.

## Citation

If you use this model in your work, please cite this repository as follows:

```bibtex
@misc{whisper-large-v3-turbo-swissgerman,
  author = {Nizar Michaud},
  title = {Whisper Large V3 Turbo Fine-Tuned for Swiss German},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/nizarmichaud/whisper-large-v3-turbo-swissgerman},
  doi = {10.57967/hf/3858},
}
```