title: "Group-DRO MMS-based ASR model - set 1" | |
language: multilingual | |
tags: | |
- asr | |
- group-dro | |
- MMS | |
license: cc-by-nc-4.0 | |
# Group-DRO MMS-based ASR model - set 1
This repository contains an automatic speech recognition (ASR) model based on MMS (Massively Multilingual Speech) and trained with Group Distributionally Robust Optimization (Group-DRO) in ESPnet. The model was trained on the balanced training data of set 1.
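For readers unfamiliar with Group-DRO, the idea is to optimize against the worst-performing groups (e.g. languages) rather than the average loss. The PyTorch sketch below only illustrates that objective; the function name `group_dro_loss`, the weight vector `q`, and the step size `eta` are assumed placeholders, and this is not the recipe used to train this checkpoint.

```python
import torch

def group_dro_loss(per_sample_loss, group_ids, q, eta=0.01):
    """Schematic Group-DRO step (illustrative only, not this model's training code).

    per_sample_loss: (batch,) tensor of per-utterance losses
    group_ids:       (batch,) long tensor mapping each utterance to its group
    q:               (num_groups,) tensor of group weights, carried across steps
    eta:             step size for the group-weight update (assumed hyperparameter)
    """
    num_groups = q.numel()

    # Mean loss per group present in the batch
    group_loss = torch.zeros(num_groups, device=per_sample_loss.device)
    group_count = torch.zeros(num_groups, device=per_sample_loss.device)
    group_loss.index_add_(0, group_ids, per_sample_loss)
    group_count.index_add_(0, group_ids, torch.ones_like(per_sample_loss))
    group_loss = group_loss / group_count.clamp(min=1)

    # Exponentiated-gradient update of the group weights, then renormalize
    q = q * torch.exp(eta * group_loss.detach())
    q = q / q.sum()

    # Robust loss: group losses reweighted toward the worst-performing groups
    return (q * group_loss).sum(), q
```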
## Intended Use
This model is intended for ASR inference. You can load the provided checkpoint (`valid.loss.best.pth`) and configuration file (`config.yaml`) with ESPnet's `Speech2Text` interface, as in the Python example below:
```python
import soundfile as sf
from espnet2.bin.asr_inference import Speech2Text

# Paths to the files shipped in this repository
asr_train_config = "group-dro_mms_set_1/config.yaml"
asr_model_file = "group-dro_mms_set_1/valid.loss.best.pth"

# Build the inference wrapper from the training config and checkpoint
model = Speech2Text.from_pretrained(
    asr_train_config=asr_train_config,
    asr_model_file=asr_model_file,
)

# Read a waveform and decode; the best hypothesis text is the first element
speech, _ = sf.read("input.wav")
text, *_ = model(speech)[0]
print("Recognized text:", text)
```
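The model expects audio at the sample rate used during training; MMS-based models typically use 16 kHz, but check `config.yaml` to confirm. If your input differs, resample it first, for example with `librosa` (16 kHz is an assumed target rate here):

```python
import librosa

# Load and resample to 16 kHz (verify the expected rate in config.yaml)
speech, _ = librosa.load("input.wav", sr=16000)
text, *_ = model(speech)[0]
print("Recognized text:", text)
```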
## How to Use
1. Clone this repository.
2. Use ESPnet's inference scripts (or the `Speech2Text` API shown above) with the provided `config.yaml` and checkpoint file; a batch-inference sketch follows this list.
3. Ensure any external resources referenced in `config.yaml` are available at the indicated relative paths.
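As a sketch of step 2, the snippet below transcribes every WAV file in a directory using the same `Speech2Text` interface shown above. The directory name `wavs` and the assumption of single-channel audio at the model's expected sample rate are placeholders for your own setup.

```python
from pathlib import Path

import soundfile as sf
from espnet2.bin.asr_inference import Speech2Text

# Load the model once, then reuse it for all files
model = Speech2Text.from_pretrained(
    asr_train_config="group-dro_mms_set_1/config.yaml",
    asr_model_file="group-dro_mms_set_1/valid.loss.best.pth",
)

# Transcribe every WAV file in the (assumed) "wavs" directory
for wav_path in sorted(Path("wavs").glob("*.wav")):
    speech, _ = sf.read(str(wav_path))
    text, *_ = model(speech)[0]
    print(f"{wav_path.name}\t{text}")
```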