metadata
title: youtube-channel-surfer-ai
license: mit
emoji: 📺
app_file: app.py
sdk: gradio
pinned: false
python_version: 3.13

📺 YouTube Metadata Q&A Agent

This application indexes YouTube channels and answers natural-language questions about their videos. It uses OpenAI embeddings and GPT-4o-mini to generate answers from video metadata (titles and descriptions), and it displays the most relevant videos in an interactive table.


Features

  • Index YouTube Channels: Provide one or more YouTube channel URLs to index video metadata.
  • Search & Answer Questions: Ask questions about channel content and get answers generated by an LLM.
  • Top Video Results: View the most relevant videos in a structured HTML table with clickable links.
  • Embedded Video Player: Watch videos directly in the app using YouTube embeds.
  • Refresh Channels: Update previously indexed channels to include the latest videos.
  • Lightweight Storage: Uses a local ChromaDB persistent database to store video embeddings for fast retrieval.
  • Structured LLM Output: The LLM returns a structured LLMAnswer object (answer text plus top videos) so results render consistently.

How it Works

  1. Channel Indexing:

    • The app fetches the latest videos from provided YouTube channels using the YouTube Data API.
    • Video metadata (title, description, channel, video ID) is embedded with OpenAI embeddings and stored in ChromaDB.
  2. Query & Retrieval:

    • User queries are embedded and compared with stored video embeddings.
    • Top matching videos are retrieved.
  3. Answer Generation:

    • The LLM generates an answer based on the top video metadata.
    • The answer and top videos are returned as structured data (LLMAnswer).
  4. Rendering:

    • Answer text is displayed in Markdown.
    • Top videos are displayed in a structured HTML table with clickable links and embedded YouTube players.
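
The retrieval step (step 2 above) can be illustrated with a minimal, dependency-free sketch. It is not the app's code: the app stores OpenAI embeddings in ChromaDB, whereas this example uses hand-made three-dimensional vectors, and the names IndexedVideo, cosine_similarity, and retrieve_top_k are hypothetical. The similarity metric the app's ChromaDB collection uses is not stated in this README, so cosine similarity is an assumption here.

```python
import math
from dataclasses import dataclass


@dataclass
class IndexedVideo:
    """Hypothetical record: one indexed video and its embedding."""
    video_id: str
    title: str
    embedding: list[float]  # hand-made stand-in for an OpenAI embedding


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve_top_k(
    query_embedding: list[float],
    videos: list[IndexedVideo],
    top_k: int = 2,
) -> list[IndexedVideo]:
    """Rank videos by similarity to the query and keep the best top_k."""
    ranked = sorted(
        videos,
        key=lambda v: cosine_similarity(query_embedding, v.embedding),
        reverse=True,
    )
    return ranked[:top_k]


# Toy data: two bread videos point in a similar direction, the bike video does not.
videos = [
    IndexedVideo("vid_a", "Intro to sourdough baking", [0.9, 0.1, 0.0]),
    IndexedVideo("vid_b", "Bike maintenance basics", [0.0, 0.2, 0.9]),
    IndexedVideo("vid_c", "Advanced bread shaping", [0.8, 0.3, 0.1]),
]
query = [1.0, 0.2, 0.0]  # stand-in for the embedded user question
top = retrieve_top_k(query, videos, top_k=2)
print([v.title for v in top])
```

In the real pipeline, the metadata of the retrieved videos would then be passed to the LLM as context for answer generation.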

Installation

Steps to Run

  1. Clone the repository:

     git clone <repo_url>
     cd youtube_surfer_ai_agent
    
  2. Create and activate a virtual environment:

    • Linux/macOS:

        python -m venv .venv
        source .venv/bin/activate
      
    • Windows:

        python -m venv .venv
        .venv\Scripts\activate
      
  3. Install dependencies:

     pip install -r requirements.txt
    
  4. Create a .env file in the project root with your API keys:

     YOUTUBE_API_KEY=your_youtube_api_key
     OPENAI_API_KEY=your_openai_api_key
    
  5. Run the application:

     python app.py
    
  6. Open the Gradio interface in your browser (default: http://127.0.0.1:7860).


How to Use

  • Index Channels: Paste one or more YouTube channel URLs (comma- or newline-separated) and click "Index Channels".
  • Refresh Channels: Use the sidebar "Refresh All Channels" button to update existing channels.
  • Ask Questions: Type a query in the text box and click "Get Answer" to receive a structured response with embedded videos.
  • View Indexed Channels: The sidebar lists every indexed channel as a clickable link.
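
As an illustration of the comma- or newline-separated input accepted by "Index Channels", the sketch below splits a pasted string into individual URLs. The function name parse_channel_urls and the example URLs are hypothetical, and removing duplicates is an assumption; the app's own parsing may differ.

```python
import re


def parse_channel_urls(raw: str) -> list[str]:
    """Split pasted text on commas and newlines, dropping blanks and duplicates."""
    urls: list[str] = []
    seen: set[str] = set()
    for part in re.split(r"[,\n]+", raw):
        url = part.strip()  # also removes a stray carriage return from Windows input
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


pasted = (
    "https://www.youtube.com/@channel_one, https://www.youtube.com/@channel_two\n"
    "https://www.youtube.com/@channel_one\n"
)
print(parse_channel_urls(pasted))
```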

Notes

  • The LLM uses structured outputs (LLMAnswer + VideoItem) internally to produce consistent results.
  • Top videos are embedded as iframes in the Gradio interface.
  • You can adjust the number of top videos returned by modifying the top_k parameter in answer_query.
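
To make the structured output concrete, here is a hedged sketch of what LLMAnswer and VideoItem might look like and how a result could be rendered as an HTML table. Only the two class names come from this README; the field names, the use of plain dataclasses, the render_videos_table helper, and its max_rows parameter are assumptions for illustration.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class VideoItem:
    """Assumed fields, based on the metadata listed under How it Works."""
    video_id: str
    title: str
    channel: str


@dataclass
class LLMAnswer:
    """Assumed shape: answer text plus the videos that support it."""
    answer_text: str
    top_videos: list[VideoItem] = field(default_factory=list)


def render_videos_table(answer: LLMAnswer, max_rows: int = 5) -> str:
    """Build an HTML table with one clickable row per video, escaping all text."""
    rows = []
    for video in answer.top_videos[:max_rows]:
        watch_url = "https://www.youtube.com/watch?v=" + escape(video.video_id, quote=True)
        rows.append(
            f'<tr><td><a href="{watch_url}" target="_blank">{escape(video.title)}</a></td>'
            f"<td>{escape(video.channel)}</td></tr>"
        )
    header = "<tr><th>Title</th><th>Channel</th></tr>"
    return "<table>" + header + "".join(rows) + "</table>"


answer = LLMAnswer(
    answer_text="Two videos cover bread baking.",
    top_videos=[
        VideoItem("vid_a", "Bread & Butter basics", "Example Kitchen"),
        VideoItem("vid_c", "Advanced bread shaping", "Example Kitchen"),
        VideoItem("vid_b", "Bike maintenance basics", "Example Garage"),
    ],
)
table_html = render_videos_table(answer, max_rows=2)
print(table_html)
```

Escaping titles and channel names before interpolating them into HTML matters here because video metadata is untrusted text.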