w2v-bert-uk `v2.1`

Community

Discord: https://bit.ly/discord-uds
Speech Recognition: https://t.me/speech_recognition_uk
Speech Synthesis: https://t.me/speech_synthesis_uk

Overview

This model is the successor to https://huggingface.co/Yehor/w2v-bert-uk

Demo

Try the model on your own audio files in the demo space: https://huggingface.co/spaces/Yehor/w2v-bert-uk-v2.1-demo

Usage

# pip install -U torch soundfile transformers

import torch
import soundfile as sf
from transformers import AutoModelForCTC, Wav2Vec2BertProcessor

# Config
model_name = 'Yehor/w2v-bert-2.0-uk-v2.1'
device = 'cuda:1' # second GPU; use 'cuda:0' or 'cpu' as appropriate
sampling_rate = 16_000

# Load the model
asr_model = AutoModelForCTC.from_pretrained(model_name).to(device)
processor = Wav2Vec2BertProcessor.from_pretrained(model_name)

paths = [
  'sample1.wav',
]

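# Read the audio files and check that each one matches the expected sampling rate
audio_inputs = []
for path in paths:
  audio_input, file_sampling_rate = sf.read(path)
  if file_sampling_rate != sampling_rate:
    raise ValueError(f'{path}: expected {sampling_rate} Hz, got {file_sampling_rate} Hz')
  audio_inputs.append(audio_input)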

# Transcribe the audio
inputs = processor(audio_inputs, sampling_rate=sampling_rate).input_features
features = torch.tensor(inputs).to(device)

with torch.inference_mode():
  logits = asr_model(features).logits

predicted_ids = torch.argmax(logits, dim=-1)
predictions = processor.batch_decode(predicted_ids)

# Log results
print('Predictions:')
print(predictions)
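
The last two steps of the snippet above (taking the argmax over the logits, then calling `batch_decode`) amount to greedy CTC decoding: consecutive repeated ids are collapsed and blank tokens are dropped. The sketch below illustrates that idea in plain Python. The vocabulary, `BLANK_ID`, and the `greedy_ctc_decode` helper are hypothetical and exist only for illustration; the real vocabulary ships with the model's tokenizer, which also handles details such as word delimiters.

```python
from itertools import groupby

# Hypothetical toy vocabulary (assumption); the real one comes from the tokenizer.
BLANK_ID = 0
VOCAB = {0: "<pad>", 1: "т", 2: "а", 3: "к", 4: " "}


def greedy_ctc_decode(frame_ids, vocab=VOCAB, blank_id=BLANK_ID):
    """Collapse consecutive repeated ids, then drop blank tokens."""
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    return "".join(vocab[i] for i in collapsed if i != blank_id)


# Per-frame argmax ids, shaped like one row of torch.argmax(logits, dim=-1)
frame_ids = [0, 1, 1, 0, 2, 2, 2, 0, 0, 3, 0]
print(greedy_ctc_decode(frame_ids))
```

Note that a blank between two identical ids keeps both characters, which is how CTC represents doubled letters.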
