---
title: "Group-DRO MMS-based ASR model - set 1"
language: multilingual
tags:
- asr
- group-dro
- MMS
license: cc-by-nc-4.0
---
# Group-DRO MMS-based ASR model - set 1
This repository contains a Group-DRO MMS-based automatic speech recognition (ASR) model trained with ESPnet.
The model was trained on balanced training data from set 1.
## Intended Use
This model is intended for ASR. Users can run inference using the provided checkpoint (`valid.loss.best.pth`) and configuration file (`config.yaml`):
```python
import soundfile as sf
from espnet2.bin.asr_inference import Speech2Text

# Paths to the configuration and checkpoint shipped in this repository
asr_train_config = "group-dro_mms_set_1/config.yaml"
asr_model_file = "group-dro_mms_set_1/valid.loss.best.pth"

# Build the inference wrapper from the local files
model = Speech2Text.from_pretrained(
    asr_train_config=asr_train_config,
    asr_model_file=asr_model_file
)

# Read the waveform and take the text of the best hypothesis
speech, _ = sf.read("input.wav")
text, *_ = model(speech)[0]
print("Recognized text:", text)
```
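The snippet above discards the sample rate returned by `sf.read`. The sketch below shows one way to validate the audio before inference. It assumes the model expects 16 kHz mono input, which is typical for MMS-based front ends but is not stated in this repository, so check `config.yaml` for the actual value. The helper name `prepare_audio` and the constant `EXPECTED_RATE` are hypothetical.

```python
import numpy as np

# Assumption: 16 kHz is typical for MMS-based models; not confirmed by this repository.
EXPECTED_RATE = 16000


def prepare_audio(speech, rate, expected_rate=EXPECTED_RATE):
    """Return mono float32 audio, raising if the sample rate is unexpected."""
    speech = np.asarray(speech, dtype=np.float32)
    if speech.ndim == 2:
        # soundfile returns (frames, channels) for multi-channel files,
        # so average across the channel axis to get a mono signal.
        speech = speech.mean(axis=1)
    if rate != expected_rate:
        raise ValueError(
            f"expected {expected_rate} Hz audio, got {rate} Hz; resample first"
        )
    return speech
```

With this helper, the read step could become `speech, rate = sf.read("input.wav")` followed by `speech = prepare_audio(speech, rate)`. The helper raises instead of resampling silently so that a rate mismatch is visible to the caller.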
## How to Use
1. Clone this repository.
2. Use ESPnet’s inference scripts with the provided `config.yaml` and checkpoint file.
3. Ensure any external resources referenced in `config.yaml` are available at the indicated relative paths.
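
Step 3 can be checked before running inference. The following is a minimal sketch, not part of ESPnet: it walks a loaded configuration and reports string values that look like file paths but do not exist. The suffix list, the function names, and the choice of the current directory as the base for relative paths are all assumptions made for illustration. It also assumes PyYAML is installed, which is normally the case alongside ESPnet.

```python
from pathlib import Path

# Heuristic (assumption): string values ending in these suffixes are file references.
PATH_SUFFIXES = (".pth", ".pt", ".yaml", ".yml", ".json", ".txt", ".model", ".npz", ".npy")


def iter_strings(node):
    """Yield every string value found in a nested dict/list structure."""
    if isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            yield from iter_strings(item)
    elif isinstance(node, str):
        yield node


def find_missing_paths(config, base_dir="."):
    """Return path-like config values that do not exist relative to base_dir."""
    base = Path(base_dir)
    missing = []
    for value in iter_strings(config):
        if value.endswith(PATH_SUFFIXES) and not (base / value).exists():
            missing.append(value)
    return missing


def check_config_file(config_path, base_dir="."):
    """Load a YAML config and print each referenced file that is missing."""
    import yaml  # PyYAML; imported here so the helpers above need only the stdlib

    with open(config_path, encoding="utf-8") as f:
        config = yaml.safe_load(f)
    for path in find_missing_paths(config, base_dir):
        print("Missing:", path)
```

For example, `check_config_file("group-dro_mms_set_1/config.yaml")` would print one line per missing file, resolved relative to the current working directory. Because the suffix check is only a heuristic, it can miss resources with other extensions.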