---
title: "CTC-Baseline XLSR-based ASR model - set 4"
language: multilingual
tags:
  - asr
  - ctc-dro
  - XLSR
license: cc-by-nc-4.0
---

# CTC-Baseline XLSR-based ASR model - set 4

This repository contains a CTC-Baseline XLSR-based automatic speech recognition (ASR) model trained with ESPnet.  
The model was trained on balanced training data from set 4.

## Intended Use

This model is intended for ASR. Users can run inference using the provided checkpoint (`valid.loss.best.pth`) and configuration file (`config.yaml`):
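```python
import soundfile as sf
from espnet2.bin.asr_inference import Speech2Text

# Paths to the training configuration and checkpoint provided in this repository.
asr_train_config = "ctc-baseline_xlsr_set_4/config.yaml"
asr_model_file = "ctc-baseline_xlsr_set_4/valid.loss.best.pth"

# Load the model from the local configuration and checkpoint.
model = Speech2Text.from_pretrained(
    asr_train_config=asr_train_config,
    asr_model_file=asr_model_file
)

# Read the waveform. The audio should use the sampling rate the model was
# trained with (XLSR models are pretrained on 16 kHz audio).
speech, _ = sf.read("input.wav")

# The model returns an n-best list; keep the text of the top hypothesis.
text, *_ = model(speech)[0]

print("Recognized text:", text)
```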

## How to Use

1. Clone this repository.
2. Run inference with ESPnet (for example, through the `Speech2Text` interface in `espnet2.bin.asr_inference`) using the provided `config.yaml` and the `valid.loss.best.pth` checkpoint.
3. Ensure any external resources referenced in `config.yaml` are available at the indicated relative paths.
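
Step 3 can be checked before loading the model. The following is a minimal sketch, not part of ESPnet: it walks a parsed `config.yaml` and reports string values that look like file paths but do not exist on disk. The helper names (`iter_strings`, `find_missing_paths`, `check_config`) and the suffix list are hypothetical, and the suffix heuristic is an assumption that may miss or over-report entries for this particular configuration.

```python
from pathlib import Path

# Assumption: a string value is treated as a file path if it ends with one of
# these suffixes. Adjust the list to match what config.yaml actually references.
PATH_SUFFIXES = (".pth", ".pt", ".yaml", ".yml", ".txt", ".model", ".npz", ".json")


def iter_strings(node):
    """Yield every string value nested anywhere inside a parsed YAML structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            yield from iter_strings(item)


def find_missing_paths(config, base_dir="."):
    """Return path-like strings in `config` that do not exist under `base_dir`."""
    base = Path(base_dir)
    missing = []
    for value in iter_strings(config):
        # Joining with an absolute path keeps the absolute path unchanged.
        if value.endswith(PATH_SUFFIXES) and not (base / value).exists():
            missing.append(value)
    return missing


def check_config(config_path, base_dir="."):
    """Load a YAML config file and return the path-like entries that are missing."""
    import yaml  # PyYAML, which ESPnet already depends on

    with open(config_path, encoding="utf-8") as f:
        config = yaml.safe_load(f)
    return find_missing_paths(config, base_dir)
```

For example, `check_config("ctc-baseline_xlsr_set_4/config.yaml")` run from the repository root returns a list of entries to inspect. An empty list means only that nothing matching the heuristic was missing; which directory the relative paths are resolved against is itself an assumption here.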