Pathumma Whisper Medium (Th)
Model Description
Additional information is needed
Quickstart
You can transcribe audio files with the Transformers pipeline class, as in the following snippet:
import torch
from transformers import pipeline

# Use the GPU when available; bfloat16 reduces memory use on GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.bfloat16 if torch.cuda.is_available() else torch.float32

lang = "th"
task = "transcribe"

pipe = pipeline(
    task="automatic-speech-recognition",
    model="nectec/Pathumma-whisper-th-medium",
    torch_dtype=torch_dtype,
    device=device,
)

# Force Thai transcription instead of relying on language detection.
pipe.model.config.forced_decoder_ids = pipe.tokenizer.get_decoder_prompt_ids(language=lang, task=task)

# Replace "audio_path.wav" with the path to your audio file.
text = pipe("audio_path.wav")["text"]
print(text)
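To transcribe several recordings, the single-file call above can be wrapped in a small loop. The following is a minimal sketch, not part of the original model card: the helper names `collect_audio_files` and `transcribe_folder` and the extension list are hypothetical, and it assumes only that `pipe` accepts a file path and returns a dictionary with a "text" key, as in the snippet above.

```python
from pathlib import Path

# Hypothetical default; adjust to the formats your audio backend can decode.
AUDIO_EXTENSIONS = (".wav", ".mp3", ".flac")


def collect_audio_files(root, extensions=AUDIO_EXTENSIONS):
    """Return audio files under `root` (recursively), sorted for a stable order."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )


def transcribe_folder(pipe, root, **call_kwargs):
    """Map each audio file path under `root` to the text returned by `pipe`.

    `pipe` is any callable that takes a path string and returns a dict
    with a "text" key; extra keyword arguments are forwarded unchanged.
    """
    transcripts = {}
    for path in collect_audio_files(root):
        result = pipe(str(path), **call_kwargs)
        transcripts[str(path)] = result["text"]
    return transcripts
```

With the `pipe` object from the Quickstart, a call such as `transcribe_folder(pipe, "recordings/")` (where "recordings/" is a placeholder directory) would return a dictionary keyed by file path. Files are processed one at a time, which keeps memory use predictable at the cost of throughput.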
Limitations and Future Work
Additional information is needed
Acknowledgements
We thank the research teams engaged in the creation of open speech models, including AIResearch, BiodatLab, Looloo Technology, SCB 10X, and OpenAI. We are grateful to Dr. Titipat Achakulwisut of BiodatLab for the evaluation pipeline, and to ThaiSC (the NSTDA Supercomputer Centre) for providing the LANTA supercomputer used for model training, fine-tuning, and evaluation.
Pathumma Audio Team
Pattara Tipaksorn, Wayupuk Sommuang, Kwanchiva Thangthai
Citation
@misc{tipaksorn2024PathummaWhisper,
title = { {Pathumma Whisper Medium (TH)} },
author = { Pattara Tipaksorn and Wayupuk Sommuang and Kwanchiva Thangthai },
url = { https://huggingface.co/nectec/Pathumma-whisper-th-medium },
publisher = { Hugging Face },
year = { 2024 },
}
Base model: openai/whisper-medium