whisper-hf-rslora

This model is a fine-tuned version of openai/whisper-large-v3-turbo on compulsion/heart-failure-audio. It achieves the following results on the evaluation set:

  • Loss: 0.6919
  • WER: 0.2424

Model description

A PEFT rank-stabilized LoRA (rsLoRA) adapter for whisper-large-v3-turbo, fine-tuned on heart failure audio that is conversational, longitudinal, and focused on chronic illness management and care coordination in a community-based healthcare setting.

Intended uses & limitations

Intended for automatic speech recognition (ASR) tasks in the heart failure domain.
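
A minimal inference sketch follows. The repository IDs are taken from this card; the silent placeholder waveform and the language setting are assumptions to keep the snippet self-contained, so replace them with your own 16 kHz audio and language.

```python
import numpy as np
import torch
from peft import PeftModel
from transformers import WhisperForConditionalGeneration, WhisperProcessor

base_id = "openai/whisper-large-v3-turbo"
adapter_id = "compulsi0n/whisper-hf-rslora"  # this adapter

processor = WhisperProcessor.from_pretrained(base_id)
base_model = WhisperForConditionalGeneration.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Placeholder: one second of silence; replace with a real 16 kHz mono waveform.
audio = np.zeros(16_000, dtype=np.float32)
inputs = processor(audio, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    generated_ids = model.generate(
        input_features=inputs.input_features,
        language="en",      # assumed; set to the language of your audio
        task="transcribe",
    )
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```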

Benchmark (base whisper-large-v3-turbo vs. fine-tuned rank-stabilized LoRA adapter)

Scores are normalized for PHI redactions and computed after applying Transformers' BasicTextNormalizer; a scoring sketch follows the table.

Model      Raw WER (%)   Normalised WER (%)
Baseline   35.00         26.71
rsLoRA     26.18         20.71
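
A minimal sketch of this kind of scoring, assuming jiwer for WER and "[REDACTED]" as the PHI-redaction marker (the actual marker and evaluation script are not published in this card):

```python
import jiwer
from transformers.models.whisper.english_normalizer import BasicTextNormalizer

normalizer = BasicTextNormalizer()

# Toy reference/hypothesis pair; "[REDACTED]" is an assumed PHI-redaction marker.
references = ["the patient reports worsening shortness of breath, [REDACTED]"]
hypotheses = ["patient reports worsening shortness of breath"]

def strip_phi(text: str) -> str:
    # Remove redaction placeholders so they are not counted as errors.
    return text.replace("[REDACTED]", " ")

raw_wer = jiwer.wer(references, hypotheses)          # "Raw WER" column
norm_wer = jiwer.wer(                                # "Normalised WER" column
    [normalizer(strip_phi(r)) for r in references],
    [normalizer(strip_phi(h)) for h in hypotheses],
)
print(f"raw WER: {raw_wer:.2%}   normalised WER: {norm_wer:.2%}")
```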

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 8
  • mixed_precision_training: Native AMP
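
A configuration sketch matching the hyperparameters above is shown below. The LoRA rank, alpha, dropout, and target modules are placeholders, since they are not reported in this card; use_rslora=True enables the rank-stabilized scaling (lora_alpha / sqrt(r)) that gives the adapter its name.

```python
from peft import LoraConfig, get_peft_model
from transformers import Seq2SeqTrainingArguments, WhisperForConditionalGeneration

base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3-turbo")

lora_config = LoraConfig(
    r=32,                                  # placeholder rank
    lora_alpha=64,                         # placeholder alpha
    use_rslora=True,                       # rank-stabilized LoRA scaling
    target_modules=["q_proj", "v_proj"],   # assumed attention projections
    lora_dropout=0.05,                     # placeholder dropout
)
model = get_peft_model(base_model, lora_config)

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-hf-rslora",
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,        # effective train batch size 16
    lr_scheduler_type="constant_with_warmup",
    warmup_steps=500,
    num_train_epochs=8,
    fp16=True,                            # native AMP mixed precision
    seed=42,
)
```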

Training results

Training Loss   Epoch   Step   Validation Loss   WER
2.3062          1.0     92     1.1343            0.2388
1.0317          2.0     184    0.7145            0.2620
0.6833          3.0     276    0.6606            0.2105
0.5934          4.0     368    0.6292            0.2122
0.5104          5.0     460    0.6347            0.2521
0.4392          6.0     552    0.6444            0.2729
0.3653          7.0     644    0.6701            0.2198
0.3178          8.0     736    0.6919            0.2424

Framework versions

  • PEFT 0.15.2
  • Transformers 4.52.4
  • PyTorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.1