---
title: "CTC-DRO XLSR-based ASR model - set 4"
language: multilingual
tags:
- asr
- ctc-dro
- XLSR
license: cc-by-nc-4.0
---

# CTC-Baseline XLSR-based ASR model - set 4

This repository contains a CTC-Baseline XLSR-based automatic speech recognition (ASR) model trained with ESPnet. The model was trained on balanced training data from set 4.

## Intended Use

This model is intended for ASR. To run inference, load the provided checkpoint (`valid.loss.best.pth`) and configuration file (`config.yaml`) in Python:

```python
import soundfile as sf
from espnet2.bin.asr_inference import Speech2Text

# Paths to the training config and checkpoint shipped with this repository
asr_train_config = "ctc-baseline_xlsr_set_4/config.yaml"
asr_model_file = "ctc-baseline_xlsr_set_4/valid.loss.best.pth"

# Build the inference wrapper from the local config and checkpoint
model = Speech2Text.from_pretrained(
    asr_train_config=asr_train_config,
    asr_model_file=asr_model_file
)

# The sample rate returned by sf.read is discarded here, so input.wav should
# already be at the rate the model expects (16 kHz for XLSR-based models)
speech, _ = sf.read("input.wav")

# Take the best hypothesis; its first element is the decoded text
text, *_ = model(speech)[0]
print("Recognized text:", text)
```

## How to Use

1. Clone this repository.
2. Use ESPnet’s inference scripts with the provided `config.yaml` and checkpoint file.
3. Ensure any external resources referenced in `config.yaml` are available at the indicated relative paths.
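
Step 3 can be checked before running inference. The following is a minimal sketch, not part of ESPnet: it walks a parsed config and reports string values that look like file references but do not exist on disk. The helper names (`iter_path_like`, `find_missing`, `check_config_file`), the suffix list, and the assumption that paths resolve relative to `base_dir` are all hypothetical choices for illustration; adjust them to match what your `config.yaml` actually references.

```python
import os
from typing import Any, Iterator, List, Tuple

# Assumption: string values ending in one of these suffixes are file references.
# This is a heuristic for illustration, not a list defined by ESPnet.
PATH_SUFFIXES = (".pth", ".pt", ".yaml", ".yml", ".model", ".txt", ".npz", ".json")


def iter_path_like(node: Any, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (key_path, value) for every string leaf that looks like a file path."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else str(key)
            yield from iter_path_like(value, child)
    elif isinstance(node, (list, tuple)):
        for index, value in enumerate(node):
            yield from iter_path_like(value, f"{prefix}[{index}]")
    elif isinstance(node, str) and node.endswith(PATH_SUFFIXES):
        yield prefix, node


def find_missing(config: Any, base_dir: str = ".") -> List[Tuple[str, str]]:
    """Return (key_path, value) pairs whose path does not exist under base_dir."""
    missing = []
    for key_path, value in iter_path_like(config):
        # os.path.join leaves absolute paths unchanged
        if not os.path.exists(os.path.join(base_dir, value)):
            missing.append((key_path, value))
    return missing


def check_config_file(config_path: str, base_dir: str = ".") -> List[Tuple[str, str]]:
    """Load a YAML config (requires PyYAML) and list missing path-like values."""
    import yaml  # imported lazily so the helpers above work without PyYAML

    with open(config_path, encoding="utf-8") as f:
        config = yaml.safe_load(f)
    return find_missing(config, base_dir)
```

For example, `check_config_file("ctc-baseline_xlsr_set_4/config.yaml")` run from the repository root returns a list of `(key_path, value)` pairs for entries that could not be found; an empty list means every path-like value matched by the heuristic exists.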