T5-based Audio Transcription Fusion Model
This model fuses candidate transcriptions of the same audio from multiple sources into a single, improved transcription. The candidates are passed as one input string, separated by '/'. The model is fine-tuned on a dataset in which each sample pairs three candidate transcriptions with one reference transcription.
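The input format described above can be sketched as a small helper. This is a minimal illustration, not the author's preprocessing code: the function name `build_fusion_input` is hypothetical, and the card does not say whether the '/' separator is padded with spaces or whether a task prefix is used, so the separator is left as a parameter.

```python
from typing import Sequence


def build_fusion_input(candidates: Sequence[str], sep: str = "/") -> str:
    """Join candidate transcriptions into one input string.

    Assumption: candidates are joined with a bare '/' as the card describes.
    Whether the original training data padded the separator with spaces is
    not stated, so pass sep=" / " if that matches your data better.
    """
    # Drop empty candidates and trim stray whitespace around each one.
    cleaned = [c.strip() for c in candidates if c and c.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty candidate is required")
    return sep.join(cleaned)


# Three candidates, matching the three-candidate samples used in fine-tuning.
example = build_fusion_input([
    "the quick brown fox",
    "the quick brown box",
    "a quick brown fox",
])
print(example)
```

The resulting string is what would be tokenized and passed to the fine-tuned T5 checkpoint for generation.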
Training Details
The model was trained on 21,000 samples for 10 epochs, with T5-small as the base model.
Training Loss: 0.005756139289587736
Evaluation Details
Test Loss: 0.011949276849159604
Word Error Rate (WER): 0.10040761999833625
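For readers unfamiliar with the metric, WER is the word-level edit distance between the hypothesis and the reference (substitutions, deletions, and insertions) divided by the number of reference words. The sketch below shows the standard per-utterance definition. The card does not state which tool, text normalization, or averaging scheme produced the reported figure, so this is illustrative only and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are split on whitespace with no case folding or
    punctuation stripping; the card does not describe its normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words; start with the empty reference.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Under this definition, a WER of about 0.10 corresponds to roughly one word error per ten reference words.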