Dataset: https://github.com/egorsmkv/speech-recognition-uk

Online variant of pruned_transducer_stateless5 for Ukrainian: https://github.com/proger/icefall/tree/uk

Decoding demo using Sherpa: https://twitter.com/darkproger/status/1570733844114046976

Trained on pseudo-labels generated by darkproger/pruned-transducer-stateless5-ukrainian-1 for the noisy 1200-hour training set. Common Voice data was used only for validation.

Tensorboard run

./pruned_transducer_stateless5/train.py \
  --world-size 2 \
  --num-epochs 31 \
  --start-epoch 1 \
  --full-libri 1 \
  --exp-dir pruned_transducer_stateless5/exp-uk-filtered2 \
  --max-duration 600 \
  --use-fp16 1 \
  --num-encoder-layers 18 \
  --dim-feedforward 1024 \
  --nhead 4 \
  --encoder-dim 256 \
  --decoder-dim 512 \
  --joiner-dim 512 \
  --bpe-model uk/data/lang_bpe_250/bpe.model \
  --causal-convolution True \
  --dynamic-chunk-training True
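
As a rough reading of the batch-size flags in the command above, the sketch below computes how much audio one optimizer step can see. It assumes that `--max-duration` is a per-GPU cap in seconds and that each of the `--world-size` processes draws its own batch; the variable names are illustrative and not part of icefall.

```shell
# Values copied from the training command above.
world_size=2        # --world-size: number of training processes (GPUs)
max_duration=600    # --max-duration: cap on audio per batch, in seconds

# Assumption: the cap applies per process, so one optimizer step
# covers at most world_size * max_duration seconds of audio.
per_step_seconds=$((world_size * max_duration))
per_step_minutes=$((per_step_seconds / 60))

echo "audio per step: up to ${per_step_seconds} s (${per_step_minutes} min)"
```

Under these assumptions, changing the number of GPUs changes the effective batch size unless `--max-duration` is adjusted to compensate.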

Evaluation results