title: WhisperPyanoteLLM
emoji: 📉
colorFrom: indigo
colorTo: green
sdk: docker
pinned: false
license: apache-2.0

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

WhisperPyanoteLLM

A FastAPI app for speaker diarization and transcription using pyannote.audio and OpenAI Whisper, with LLM-powered summarization of the resulting transcript.

Features

  • Speaker diarization with pyannote.audio
  • Transcription with OpenAI Whisper
  • Summarization with an LLM accessed through the Together API
  • REST API for video/audio upload and processing
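Diarization and transcription produce separate timelines, so a common way to combine them is to give each transcribed segment the speaker whose turn overlaps it most. The following is a minimal sketch of that merge step, not the code in app.py: the function name `assign_speakers` and the plain-tuple data layout are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Turn = Tuple[float, float, str]      # (start_sec, end_sec, speaker label)
Segment = Tuple[float, float, str]   # (start_sec, end_sec, transcribed text)


def assign_speakers(
    segments: List[Segment], turns: List[Turn]
) -> List[Tuple[Optional[str], str]]:
    """Label each transcript segment with the speaker whose turn overlaps it most.

    Segments that overlap no turn at all are labeled None.
    """
    labeled = []
    for seg_start, seg_end, text in segments:
        best_speaker, best_overlap = None, 0.0
        for turn_start, turn_end, speaker in turns:
            # Length of the intersection of the two intervals (negative if disjoint).
            overlap = min(seg_end, turn_end) - max(seg_start, turn_start)
            if overlap > best_overlap:
                best_speaker, best_overlap = speaker, overlap
        labeled.append((best_speaker, text))
    return labeled


# Hypothetical sample data, shaped like typical diarization and transcription output.
turns = [(0.0, 4.0, "SPEAKER_00"), (4.0, 9.0, "SPEAKER_01")]
segments = [
    (0.5, 3.5, "Hello."),
    (3.8, 8.0, "Hi, thanks for joining."),
    (10.0, 11.0, "Bye."),
]
result = assign_speakers(segments, turns)
```

The nested loop is quadratic in the worst case, which is usually acceptable for a single recording; sorted inputs would allow a linear sweep if that ever matters.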

Quick Start (Development)

  1. Clone the repository:

    git clone <your-repo-url>
    cd WhisperPyanoteLLM
    
  2. Create a .env file:

    HF_TOKEN=your_huggingface_token
    TOGETHER_API_KEY=your_together_api_key
    NGROK_AUTH_TOKEN=your_ngrok_token
    
  3. Install dependencies:

    pip install -r requirements.txt
    
  4. Run the app:

    uvicorn app:app --reload --port 8300
    
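  5. Access the API at http://localhost:8300, the port passed to uvicorn in step 4. FastAPI serves interactive docs at /docs unless the app disables them.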
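This README does not list the endpoints. With the dev server running on port 8300, one way to discover them is to read the OpenAPI schema that FastAPI publishes at /openapi.json by default. A small sketch under that assumption; the helper names `routes_from_schema` and `list_routes` are illustrative, and the app may have changed or disabled the schema URL.

```python
import json
from typing import Dict, List
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def routes_from_schema(schema: Dict) -> List[str]:
    """Return 'METHOD /path' strings from an OpenAPI schema dict."""
    routes = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method in sorted(operations):
            # Path items may carry non-method keys such as "parameters".
            if method in HTTP_METHODS:
                routes.append(f"{method.upper()} {path}")
    return routes


def list_routes(base_url: str = "http://localhost:8300") -> List[str]:
    """Fetch the schema from a running server and list its routes."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as resp:
        return routes_from_schema(json.load(resp))
```

Calling `list_routes()` while the server is up prints nothing by itself; wrap it in `print("\n".join(list_routes()))` to see the available upload and processing routes.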


Production (Docker)

  1. Create a .env.prod file:

    HF_TOKEN=your_huggingface_token
    TOGETHER_API_KEY=your_together_api_key
    NGROK_AUTH_TOKEN=your_ngrok_token
    
  2. Build the Docker image:

    docker build -t whisperpyanote .
    
  3. Run the Docker container:

    docker run --env-file .env.prod -p 8300:8300 whisperpyanote
    
  4. Access the API at http://localhost:8300, the host port published by -p 8300:8300.
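`--env-file` places the three variables from .env.prod into the container's process environment, where Python can read them through `os.environ`. A fail-fast check at startup makes a missing token obvious before any model is loaded. This is a sketch only: `load_settings` is a hypothetical helper, and app.py may read its configuration differently.

```python
import os
from typing import Dict, Mapping, Optional

# The variable names come from the .env / .env.prod examples in this README.
REQUIRED_KEYS = ("HF_TOKEN", "TOGETHER_API_KEY", "NGROK_AUTH_TOKEN")


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or empty."""
    source = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not source.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: source[key] for key in REQUIRED_KEYS}
```

Passing an explicit mapping, as in `load_settings({"HF_TOKEN": "..."})`, keeps the function easy to test without touching the real environment.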


Notes

  • Make sure your .env and .env.prod files are not committed to version control, for example by listing them in .gitignore.
  • For best performance, run on a machine with a CUDA-enabled GPU.
  • For more details, see the code and comments in app.py.
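For the GPU note above, device selection is usually done once at startup with a CPU fallback. A minimal sketch, assuming PyTorch is installed as a dependency of Whisper and pyannote.audio; `pick_device` is an illustrative name, not a function from app.py.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch  # assumed present via the Whisper / pyannote.audio dependencies
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
```

The resulting string can then be passed on when loading models, for instance as the `device` argument of `whisper.load_model`.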

License

Apache-2.0