title: youtube-channel-surfer-ai
license: mit
emoji: 📺
app_file: app.py
sdk: gradio
pinned: false
python_version: 3.13
📺 YouTube Metadata Q&A Agent
This application indexes YouTube channels and answers natural language questions about their videos. It uses OpenAI embeddings and GPT-4o-mini to generate answers from video metadata (titles and descriptions), and it displays the most relevant videos in an interactive table.
Features
- Index YouTube Channels: Provide one or more YouTube channel URLs to index video metadata.
- Search & Answer Questions: Ask questions about channel content and get answers generated by an LLM.
- Top Video Results: View top relevant videos in a structured HTML table with clickable links.
- Embedded Video Player: Watch videos directly in the app using YouTube embeds.
- Refresh Channels: Update previously indexed channels to include the latest videos.
- Lightweight Storage: Uses a local ChromaDB persistent database to store video embeddings for fast retrieval.
- Structured LLM Output: The LLM returns structured LLMAnswer objects containing the textual answer and the top videos, for clean rendering.
How it Works
Channel Indexing:
- The app fetches the latest videos from provided YouTube channels using the YouTube Data API.
- Video metadata (title, description, channel, video ID) is embedded with OpenAI embeddings and stored in ChromaDB.
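As a rough illustration of the indexing step, the sketch below shapes one video's metadata into the text that would be embedded plus the fields that would be stored alongside it. The VideoRecord class, both helper functions, and the id/document/metadata layout are assumptions made for this example; the app's actual code may organize this differently, and the embedding and ChromaDB calls are omitted.

```python
from dataclasses import dataclass


@dataclass
class VideoRecord:
    """Hypothetical container for one video's metadata."""
    video_id: str
    title: str
    description: str
    channel: str


def to_embedding_text(video: VideoRecord) -> str:
    """Join title and description into the text that would be embedded."""
    parts = [video.title.strip(), video.description.strip()]
    return "\n".join(p for p in parts if p)


def to_store_entry(video: VideoRecord) -> dict:
    """Shape one entry as id + document + metadata (an assumed layout
    that maps naturally onto a vector store such as ChromaDB)."""
    return {
        "id": video.video_id,
        "document": to_embedding_text(video),
        "metadata": {
            "title": video.title,
            "channel": video.channel,
            "video_id": video.video_id,
        },
    }


entry = to_store_entry(
    VideoRecord("abc123", "Intro to Embeddings", "A short primer.", "Example Channel")
)
print(entry["document"])
```

Using the video ID as the entry ID means re-indexing the same video would target the same entry rather than creating a duplicate, which is one plausible way to support refreshing a channel.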
Query & Retrieval:
- User queries are embedded and compared with stored video embeddings.
- Top matching videos are retrieved.
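To make the comparison step concrete, here is a minimal pure-Python sketch of ranking stored vectors against a query vector by cosine similarity. In the app this search is handled by ChromaDB, and the distance metric it is configured with is not stated here, so treat the metric, the function names, and the toy three-dimensional vectors as assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as matching nothing
    return dot / (norm_a * norm_b)


def top_k_videos(query_vec, stored, top_k=3):
    """Return the IDs of the top_k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), vid) for vid, vec in stored.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [vid for _, vid in scored[:top_k]]


# Toy embeddings; real OpenAI embeddings have far more dimensions.
stored = {
    "vid_cooking": [0.9, 0.1, 0.0],
    "vid_travel": [0.1, 0.9, 0.0],
    "vid_baking": [0.8, 0.2, 0.1],
}
results = top_k_videos([1.0, 0.0, 0.0], stored, top_k=2)
print(results)  # ['vid_cooking', 'vid_baking']
```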
Answer Generation:
- The LLM generates an answer based on the top video metadata.
- The answer and top videos are returned as structured data (LLMAnswer).
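One way to picture the structured result is as a pair of small data classes. The names LLMAnswer and VideoItem come from this README, but every field name below is a guess made for illustration, and the real app may define these types differently (for example, as Pydantic models).

```python
from dataclasses import asdict, dataclass, field


@dataclass
class VideoItem:
    """One recommended video (field names are assumed)."""
    video_id: str
    title: str
    channel: str


@dataclass
class LLMAnswer:
    """Textual answer plus the videos it was based on (field names are assumed)."""
    answer_text: str
    top_videos: list[VideoItem] = field(default_factory=list)


answer = LLMAnswer(
    answer_text="The channel covers sourdough basics in two videos.",
    top_videos=[VideoItem("abc123", "Sourdough 101", "Example Channel")],
)

# asdict converts nested dataclasses too, giving a plain dict for rendering.
payload = asdict(answer)
print(payload["top_videos"][0]["title"])  # Sourdough 101
```

Keeping the answer text and the video list in separate fields lets the interface render each part on its own instead of parsing free-form model output.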
Rendering:
- Answer text is displayed in Markdown.
- Top videos are displayed in a structured HTML table with clickable links and embedded YouTube players.
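The rendering step could look roughly like the sketch below, which builds an HTML table with a clickable title and an embedded player per row. The function name, the input dict keys, and the column layout are assumptions; only the YouTube watch and embed URL patterns are standard.

```python
from html import escape


def render_video_table(videos):
    """Build an HTML table with a linked title and an embedded player per video."""
    rows = []
    for video in videos:
        vid = escape(video["video_id"], quote=True)
        title = escape(video["title"])  # avoid injecting raw text into HTML
        watch_url = f"https://www.youtube.com/watch?v={vid}"
        embed_url = f"https://www.youtube.com/embed/{vid}"
        rows.append(
            "<tr>"
            f'<td><a href="{watch_url}" target="_blank">{title}</a></td>'
            f'<td><iframe src="{embed_url}" width="320" height="180" '
            "allowfullscreen></iframe></td>"
            "</tr>"
        )
    header = "<tr><th>Title</th><th>Player</th></tr>"
    return "<table>" + header + "".join(rows) + "</table>"


table_html = render_video_table([{"video_id": "abc123", "title": "Cats & Dogs"}])
print(table_html)
```

A string like this could then be passed to an HTML output component in the Gradio interface.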
Installation
Steps to Run
Clone the repository:
    git clone <repo_url>
    cd youtube_surfer_ai_agent
Create and activate a virtual environment:
Linux/macOS:
    python -m venv .venv
    source .venv/bin/activate
Windows:
    python -m venv .venv
    .venv\Scripts\activate
Install dependencies:
    pip install -r requirements.txt
Create a .env file in the project root with your API keys:
    YOUTUBE_API_KEY=your_youtube_api_key
    OPENAI_API_KEY=your_openai_api_key
Run the application:
    python app.py
Open the Gradio interface in your browser (default: http://127.0.0.1:7860).
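If the app fails at startup, a missing API key is a likely cause. The sketch below is a small standalone check for the two keys named in the setup steps; the missing_keys helper is hypothetical and not part of the app. It only inspects environment variables, so it assumes the values from the .env file have already been loaded into the environment (how the app loads that file is not covered here).

```python
import os

REQUIRED_KEYS = ("YOUTUBE_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are absent or empty in the given mapping."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Example with an explicit mapping instead of the real environment.
example_missing = missing_keys({"YOUTUBE_API_KEY": "your_youtube_api_key"})
print(example_missing)  # ['OPENAI_API_KEY']
```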
How to Use
- Index Channels: Paste one or more YouTube channel URLs (comma or newline separated) and click "Index Channels".
- Refresh Channels: Use the sidebar "Refresh All Channels" button to update existing channels.
- Ask Questions: Type a query in the text box and click "Get Answer" to receive a structured response with embedded videos.
- View Indexed Channels: The sidebar lists all channels that have been indexed with clickable links.
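Since the channel input accepts comma or newline separated URLs, the input handling might resemble the following sketch. The parse_channel_urls function is hypothetical, and dropping blank entries and duplicates is an assumption about sensible behavior rather than something the app is documented to do.

```python
import re


def parse_channel_urls(raw):
    """Split on commas or newlines, trim whitespace, drop blanks and duplicates."""
    seen = set()
    urls = []
    for piece in re.split(r"[,\n]+", raw):
        url = piece.strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


urls = parse_channel_urls(
    "https://www.youtube.com/@example_one, https://www.youtube.com/@example_two\n"
    "https://www.youtube.com/@example_one\n"
)
print(urls)
```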
Notes
- The LLM uses structured outputs (LLMAnswer + VideoItem) internally to produce consistent results.
- Top videos are embedded as iframes in the Gradio interface.
- You can adjust the number of top videos returned by modifying the top_k parameter in answer_query.