---
license: mit
language:
- de
metrics:
- bleu
- wer
base_model:
- openai/whisper-large-v3-turbo
pipeline_tag: automatic-speech-recognition
library_name: transformers
---
# SCRUBBED REPOSITORY
# MODEL TAKEN DOWN
The model had to be taken down because of the license terms of some of the datasets it was trained on.
# Whisper Large V3 Turbo (Swiss German Fine-Tuned with QLoRA)
This repository contains a fine-tuned version of OpenAI's Whisper Large V3 Turbo model, adapted to Swiss German dialects with QLoRA. In the evaluation reported below, it outperforms the unmodified Whisper Large V3 and Large V3 Turbo baselines on both WER and BLEU for Swiss German automatic speech recognition (ASR).
## Model Summary
- **Base Model**: Whisper Large V3 Turbo
- **Fine-Tuning Method**: QLoRA (8-bit precision)
- **Rank**: 200
- **Alpha**: 16
- **Hardware**: 2x NVIDIA A100 80GB GPUs
- **Training Time**: 140 hours
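To make the rank and alpha values above concrete, here is a minimal sketch of what they imply under the standard LoRA formulation: the adapter update is scaled by `alpha / rank`, and each adapted weight of shape `d_in x d_out` gains `rank * (d_in + d_out)` trainable parameters. The hidden size of 1280 is that of the Whisper large checkpoints; the choice of `q_proj` and `v_proj` as target modules and the helper names are assumptions for illustration, since the card does not state which layers were adapted.

```python
RANK = 200      # from the model summary
ALPHA = 16      # from the model summary
D_MODEL = 1280  # hidden size of Whisper large checkpoints


def lora_scaling(alpha: int, rank: int) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def build_lora_config(target_modules=("q_proj", "v_proj")):
    """Hypothetical PEFT configuration; the target modules are an assumption."""
    from peft import LoraConfig  # imported lazily so the arithmetic above runs without peft

    return LoraConfig(
        r=RANK,
        lora_alpha=ALPHA,
        target_modules=list(target_modules),
        bias="none",
    )


if __name__ == "__main__":
    print(f"scaling factor: {lora_scaling(ALPHA, RANK):.2f}")
    print(f"adapter params per {D_MODEL}x{D_MODEL} projection: "
          f"{lora_params(D_MODEL, D_MODEL, RANK):,}")
    print(f"full params per projection: {D_MODEL * D_MODEL:,}")
```

With a rank this high, each adapter on a square attention projection holds roughly a third as many parameters as the projection itself, so the fine-tune is less parameter-frugal than typical low-rank setups.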
## Performance Metrics
- **Word Error Rate (WER)**: **17.5%**
- **BLEU Score**: **65.0**
The model was evaluated on multiple datasets that cover a range of Swiss German dialects and speaker demographics.
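For readers unfamiliar with the metric, WER is the word-level edit distance between reference and hypothesis (substitutions, deletions, and insertions) divided by the number of reference words. The sketch below implements that definition directly. It is not the evaluation script used for this card, and it assumes plain whitespace tokenization with no text normalization; the function name is chosen here for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("das ist ein test", "das ist test"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.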
### Dataset Summary
The model has been trained and evaluated on a comprehensive suite of Swiss German datasets:
1. **SDS-200 Corpus**
   - **Size**: 200 hours
   - **Description**: A corpus covering all Swiss German dialects.
2. **STT4SG-350**
   - **Size**: 343 hours
   - **Description**: Balanced distribution across Swiss German dialects and demographics, including gender representation.
   - **[Dataset Link](https://swissnlp.org/home/activities/datasets/)**
3. **SwissDial-Zh v1.1**
   - **Size**: 24 hours
   - **Description**: A dataset with balanced representation of Swiss German dialects.
   - **[Dataset Link](https://mtc.ethz.ch/publications/open-source/swiss-dial.html)**
4. **Swiss Parliament Corpus V2 (SPC)**
   - **Size**: 293 hours
   - **Description**: Parliament recordings across Swiss German dialects.
   - **[Dataset Link](https://www.cs.technik.fhnw.ch/i4ds-datasets)**
5. **ASGDTS (All Swiss German Dialects Test Set)**
   - **Size**: 13 hours
   - **Description**: A stratified dataset closely resembling real-world Swiss German dialect distribution.
   - **[Dataset Link](https://www.cs.technik.fhnw.ch/i4ds-datasets)**
## Results Across Datasets
### WER Scores
| **Model** | **WER (All)** | **WER SD (All)** |
|---------------------------|----------------|--------------------|
| Turbo V3 Swiss German | **0.1672** | **0.1754** |
| Large V3 | 0.2884 | 0.2829 |
| Turbo V3 | 0.4392 | 0.2777 |
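The gap between the fine-tuned model and the baselines is easier to read as a relative error reduction, `(baseline - improved) / baseline`. The short sketch below applies that formula to the "WER (All)" column of the table above; the dictionary and function names are chosen here for illustration.

```python
# "WER (All)" values copied from the table above.
WER_ALL = {
    "Turbo V3 Swiss German": 0.1672,
    "Large V3": 0.2884,
    "Turbo V3": 0.4392,
}


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error removed by the improved model."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    fine_tuned = WER_ALL["Turbo V3 Swiss German"]
    for name in ("Large V3", "Turbo V3"):
        reduction = relative_reduction(WER_ALL[name], fine_tuned)
        print(f"{name}: {reduction:.1%} relative WER reduction")
```

On these numbers the fine-tuned model removes roughly 42% of the Large V3 errors and roughly 62% of the Turbo V3 errors.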
### BLEU Scores
| **Model** | **BLEU (All)** | **BLEU SD (All)** |
|---------------------------|----------------|--------------------|
| Turbo V3 Swiss German | **0.65** | **0.3149** |
| Large V3 | 0.5345 | 0.3453 |
| Turbo V3 | 0.3367 | 0.2975 |
## Visual Results
- Figure: WER and BLEU scores across datasets
- Figure: WER scores across datasets
- Figure: BLEU scores across datasets
## Usage
This model can be used directly with the Hugging Face Transformers library for tasks requiring Swiss German ASR.
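A minimal sketch of such usage with the Transformers `pipeline` API follows. The repository ID is taken from the citation URL in this card and may no longer resolve now that the model has been taken down. The helper names, the 30-second chunk length, and the `"german"` language hint are assumptions for illustration, not settings documented by the author.

```python
# Repository ID from the citation URL; it may no longer resolve.
MODEL_ID = "nizarmichaud/whisper-large-v3-turbo-swissgerman"


def asr_pipeline_kwargs(model_id: str = MODEL_ID, chunk_length_s: int = 30) -> dict:
    """Keyword arguments for transformers.pipeline (chunk length is an assumption)."""
    return {
        "task": "automatic-speech-recognition",
        "model": model_id,
        "chunk_length_s": chunk_length_s,  # split long recordings into chunks
    }


def transcribe(audio_path: str, language: str = "german", **overrides) -> str:
    """Transcribe one audio file; needs transformers, torch, and ffmpeg installed."""
    from transformers import pipeline  # imported lazily to keep the module importable

    asr = pipeline(**asr_pipeline_kwargs(**overrides))
    # The language hint is an assumption: the card's metadata lists `de`.
    result = asr(audio_path, generate_kwargs={"language": language, "task": "transcribe"})
    return result["text"]


if __name__ == "__main__":
    # "sample.wav" is a hypothetical local recording.
    print(transcribe("sample.wav"))
```

If the fine-tuned weights are unavailable, passing `model_id="openai/whisper-large-v3-turbo"` to `transcribe` runs the same code against the base model listed in this card's metadata.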
## Acknowledgments
Special thanks to the creators and maintainers of the datasets used in this work:
- [Swiss NLP](https://swissnlp.org/home/activities/datasets/)
- [ETH Zurich](https://mtc.ethz.ch/publications/open-source/swiss-dial.html)
- [FHNW](https://www.cs.technik.fhnw.ch/i4ds-datasets)
Thanks also to the [University of Geneva](https://unige.ch) for access to its High Performance Computing cluster, on which the model was trained.
## Citation
If you use this model in your work, please cite this repository as follows:
```bibtex
@misc{whisper-large-v3-turbo-swissgerman,
  author    = {Nizar Michaud},
  title     = {Whisper Large V3 Turbo Fine-Tuned for Swiss German},
  year      = {2024},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/nizarmichaud/whisper-large-v3-turbo-swissgerman},
  doi       = {10.57967/hf/3858}
}
```