Speech-to-Text (CPU/GPU)
- Model: openai/whisper-tiny (MIT)
- Task: Transcribe short audio clips. Requires ffmpeg installed. A minimal usage sketch follows this list.
- Note: We only provide the resources to run this model on a laptop; we did not develop the model itself. It is an open-source model developed by OpenAI, which we use for this experiment.
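The sketch below shows roughly what a transcription call looks like with the Hugging Face transformers pipeline. It is a minimal sketch, assuming transformers and torch are installed (e.g., via requirements.txt); "sample.wav" is just a placeholder file name.

```python
# Minimal sketch, assuming transformers and torch are installed; "sample.wav" is a placeholder.
from transformers import pipeline

# Load openai/whisper-tiny for automatic speech recognition.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny")

# ffmpeg is used under the hood to decode the audio file.
result = asr("sample.wav")
print(result["text"])
```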
Quick start (any project)
# 1) Create env
python -m venv .venv && source .venv/bin/activate # Windows: .venv\Scripts\activate
# 2) Install deps
pip install -r requirements.txt
# 3) Run
python main.py --help
Tip: If you have a GPU with CUDA, PyTorch will use it automatically. If not, everything runs on CPU (slower, but it works).
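As an illustration of this tip, here is one way to check for CUDA and pick the device explicitly; a sketch assuming torch and transformers are installed (the pipeline also works without the device argument).

```python
import torch
from transformers import pipeline

# Use the first GPU if CUDA is available, otherwise fall back to CPU (-1).
device = 0 if torch.cuda.is_available() else -1
asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny", device=device)
```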
To get a transcription, pass an audio file when running main.py, e.g.: python main.py --audio sample.wav
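The actual main.py is not reproduced here; the following is a hypothetical sketch of what such a --audio entry point could look like.

```python
# Hypothetical sketch of a main.py-style CLI; the real main.py in the repo may differ.
import argparse

import torch
from transformers import pipeline


def main():
    parser = argparse.ArgumentParser(description="Transcribe a short audio clip with openai/whisper-tiny")
    parser.add_argument("--audio", required=True, help="Path to an audio file (decoded via ffmpeg)")
    args = parser.parse_args()

    device = 0 if torch.cuda.is_available() else -1  # GPU if available, else CPU
    asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny", device=device)
    print(asr(args.audio)["text"])


if __name__ == "__main__":
    main()
```

Run it exactly as in the command above: python main.py --audio sample.wav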
FFmpeg Installation
- Download FFmpeg:
  - Visit https://www.gyan.dev/ffmpeg/builds/ and download ffmpeg-git-essentials.zip.
  - Extract it to C:\ffmpeg (or another folder, e.g., C:\Users\jhaishna\Documents\ffmpeg).
- Add FFmpeg to System PATH:
  - Right-click 'This PC' > Properties > Advanced system settings > Environment Variables.
  - Under 'System Variables', find Path, click 'Edit', and add C:\ffmpeg\bin (adjust if extracted elsewhere).
  - Save changes.
- Verify Installation:
  Open CMD (or the VS Code terminal) and run:
  ffmpeg -version
  The output should start with: ffmpeg version ...
  (A Python-side check is sketched after this list.)
- For VS Code PowerShell Terminal:
  If ffmpeg -version fails in VS Code, add FFmpeg to the PowerShell PATH:
  $env:PATH += ";C:\ffmpeg\bin"
  To persist the change, edit the PowerShell profile:
  notepad $PROFILE
  Add:
  $env:PATH += ";C:\ffmpeg\bin"
  Save and restart the terminal.
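If you prefer to verify from Python rather than the shell, a small standard-library check like the one below (a sketch, not part of the repo) reports whether ffmpeg is reachable on PATH.

```python
# Sketch of a Python-side ffmpeg check; uses only the standard library.
import shutil
import subprocess

if shutil.which("ffmpeg") is None:
    raise SystemExit("ffmpeg not found on PATH; install it and update PATH as described above")

# Print the first line of `ffmpeg -version`, e.g. "ffmpeg version ...".
out = subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True)
print(out.stdout.splitlines()[0])
```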
Model tree for remiai3/Speech-to-Text_by_openai_whisper-tiny
- Base model: openai/whisper-tiny