Upload 5 files

Files changed (6)

.gitattributes +1 -0
README.md +55 -0
main.py +13 -0
requirements.txt +12 -0
sample.mp3 +0 -0
sample.wav +3 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

+sample.wav filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,55 @@

+# Speech-to-Text (CPU/GPU)
+- **Model:** `openai/whisper-tiny` (MIT)
+- **Task:** Transcribe short audio clips. Requires **ffmpeg** installed.
+- **Note:** This repository only provides the resources needed to run the model on a laptop. We did not develop the model; it is an open-source model developed by OpenAI, used here for experimentation.
+## Quick start (any project)
+```bash
+# 1) Create env
+python -m venv venv && source venv/bin/activate  # Windows: venv\Scripts\activate
+# 2) Install deps
+pip install -r requirements.txt
+# 3) Run
+python main.py --help
+```
+> Tip: If a CUDA-capable GPU is available, Whisper selects it automatically. Otherwise everything runs on CPU (slower, but it works).
+---
+The transcript is printed only when you run `main.py` with an audio file:
+**Use:** `python main.py --audio sample.wav`
+## FFmpeg Installation
+1. Download FFmpeg:
+   - Visit https://www.gyan.dev/ffmpeg/builds/ and download `ffmpeg-git-essentials.zip`.
+   - Extract to `C:\ffmpeg` (or another folder, e.g., `C:\Users\jhaishna\Documents\ffmpeg`).
+2. Add FFmpeg to System PATH:
+   - Right-click 'This PC' > Properties > Advanced system settings > Environment Variables.
+   - Under 'System Variables', find `Path`, click 'Edit', and add `C:\ffmpeg\bin` (adjust if extracted elsewhere).
+   - Save changes.
+3. Verify Installation:
+   - Open CMD (or VS Code terminal) and run:
+     ```
+     ffmpeg -version
+     ```
+   - Expected output: `ffmpeg version ...`.
+4. For VS Code PowerShell Terminal:
+   - If `ffmpeg -version` fails in VS Code, add FFmpeg to the PowerShell PATH:
+     ```
+     $env:PATH += ";C:\ffmpeg\bin"
+     ```
+   - To persist, edit PowerShell profile:
+     ```
+     notepad $PROFILE
+     ```
+     Add: `$env:PATH += ";C:\ffmpeg\bin"`, then save and restart the terminal.
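
The verification in step 3 of the README can also be done from Python before running `main.py`. The following is a minimal sketch using only the standard library; the helper names `find_ffmpeg` and `ffmpeg_version_line` are illustrative assumptions and are not part of this commit.

```python
import shutil
import subprocess


def find_ffmpeg(name="ffmpeg"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(name)


def ffmpeg_version_line(path):
    """Run `<path> -version` and return the first line of its output."""
    completed = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=True
    )
    lines = completed.stdout.splitlines()
    return lines[0] if lines else ""


ffmpeg_path = find_ffmpeg()
if ffmpeg_path is None:
    print("ffmpeg not found on PATH; follow the installation steps first")
else:
    print(ffmpeg_version_line(ffmpeg_path))
```

If the first branch is taken inside the VS Code terminal but not in CMD, the PowerShell PATH fix from step 4 is the likely remedy.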

main.py ADDED

	@@ -0,0 +1,13 @@

+import argparse, whisper
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--audio", type=str, required=True, help="Path to WAV/MP3/M4A file")
+    args = parser.parse_args()
+    model = whisper.load_model("tiny")  # auto-select device
+    result = model.transcribe(args.audio, language="en")
+    print(result["text"])
+if __name__ == "__main__":
+    main()
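
As written, `main.py` loads the model before it finds out whether the audio file exists or whether ffmpeg is reachable. One possible hardening, sketched here under assumptions, is to run cheap checks first and import `whisper` only afterwards. The function names `build_parser` and `preflight` are hypothetical; the `whisper.load_model` and `model.transcribe` calls are the same ones the committed script uses.

```python
import argparse
import shutil
import sys
from pathlib import Path


def build_parser():
    """Build the same --audio interface as main.py, but parse the value as a Path."""
    parser = argparse.ArgumentParser(description="Transcribe an audio file with Whisper.")
    parser.add_argument("--audio", type=Path, required=True, help="Path to WAV/MP3/M4A file")
    return parser


def preflight(audio, ffmpeg_name="ffmpeg"):
    """Return a list of problems that would make transcription fail."""
    problems = []
    if not audio.is_file():
        problems.append(f"audio file not found: {audio}")
    if shutil.which(ffmpeg_name) is None:
        problems.append("ffmpeg is not on PATH")
    return problems


def main(argv=None):
    args = build_parser().parse_args(argv)
    problems = preflight(args.audio)
    if problems:
        for problem in problems:
            print(f"error: {problem}", file=sys.stderr)
        return 1
    # Imported late so the checks above run without loading PyTorch.
    import whisper

    model = whisper.load_model("tiny")  # device is chosen by Whisper
    result = model.transcribe(str(args.audio), language="en")
    print(result["text"])
    return 0
```

To use this as a script, end the file with `if __name__ == "__main__": sys.exit(main())` so the return value becomes the process exit code.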

requirements.txt ADDED

	@@ -0,0 +1,12 @@

+torch==2.1.0
+torchvision==0.16.0
+torchaudio==2.1.0
+transformers==4.38.2
+datasets==2.18.0
+Pillow==10.2.0
+numpy==1.26.4
+tqdm==4.66.2
+sentencepiece==0.1.99
+sentence-transformers==2.6.1
+easyocr==1.7.1
+openai-whisper

sample.mp3 ADDED

Binary file (17.2 kB).

sample.wav ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8c38478da3500c9d88f981eed088dd1f06e2128cf9afa8d8aade14e271704b98
+size 137166
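
The three lines added for `sample.wav` are a Git LFS pointer, not audio data: the repository stores the spec version, the SHA-256 of the real file, and its size in bytes, while the audio itself lives in LFS storage. A small sketch of reading such a pointer follows; `parse_lfs_pointer` is an illustrative helper, not a Git LFS API.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each pointer line is "<key> <value>", separated by a single space.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8c38478da3500c9d88f981eed088dd1f06e2128cf9afa8d8aade14e271704b98
size 137166
"""

info = parse_lfs_pointer(POINTER)
print(info["algo"], info["size"])  # prints: sha256 137166
```

If `python main.py --audio sample.wav` fails on a fresh clone, checking whether `sample.wav` on disk is still a pointer of roughly this size (about 130 bytes) rather than the 137166-byte audio file is a reasonable first step.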