~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
README: GAIA Agent for Hugging Face Spaces
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Overview *
-----------
The GAIAgent is built for Hugging Face Spaces to tackle questions and files from the GAIA dataset. Powered by GAIAProcessor in agent.py, it handles diverse file formats and question types with advanced tools for robust answers.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Structure *
------------
- agent.py: Core logic for GAIAProcessor. Uses LangGraph to manage web search (Wikipedia, Arxiv, DuckDuckGo, targeted scraping), file processing, and answer generation with ChatOllama (qwen2:7b, llama3:8b).
- app.py: Gradio interface for HF Spaces. Integrates GAIAgent to fetch questions, process them, and submit answers to the GAIA API.
- Dependencies: Listed in requirements.txt; includes gradio, pandas, requests, retrying==1.3.4, langchain_ollama, pydub, faster-whisper, sentence-transformers, faiss-cpu, ollama, shazamio, langchain-community, pdfplumber, PyPDF2, python-docx, python-pptx, pytesseract.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Functionality *
---------------
- File Processing:
  * Parses PDF, XLSX, CSV, TXT, JSON, JSONL, PNG, JPG, DOCX, PPTX, MP3, M4A, XML.
  * Extracts text with pdfplumber, PyPDF2, pytesseract (OCR for images), python-docx, python-pptx.
  * Handles MP3: transcribes via faster-whisper, recognizes songs with shazamio, measures duration using pydub.
- Question Handling:
  * Analyzes questions to trigger actions (file parsing, web search).
  * Supports ASCII art, card games, crosswords, dice games, addresses, song recognition.
  * Uses RAG with sentence-transformers and faiss for MP3 audiobooks.
- Web Search:
  * Queries Wikipedia, Arxiv, DuckDuckGo, and sites like US Census, Macrotrends, X, museums.
  * Scrapes content with BeautifulSoup and requests.
- Answer Generation:
  * Combines file content, web results, and an LLM (qwen2:7b or llama3:8b) for precise answers.
  * Validates formats (numbers, addresses) and filters irrelevant content.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Workflow *
-----------
1. Input: Receives task_id, question, and optional file_path via Gradio or the API.
2. Processing:
   * web_search: Gathers web data.
   * analyze_question: Processes the file and question.
   * create_answer: Generates the answer using the LLM and gathered context.
3. Output: Returns the answer for GAIA API submission.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Setup *
--------
1. Clone the repository.
2. Install dependencies: pip install -r requirements.txt
3. Ensure Ollama is running with the qwen2:7b and llama3:8b models.
4. Run locally: python app.py (Gradio UI) or deploy on HF Spaces.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Postscript *
-------------
This implementation was developed without clear platform specifics, as the course offers no guidance on HF Spaces integration. It does not cover how to test question and file processing via an external REST API, nor does it provide examples of feeding questions and file paths through a REST client to verify functionality. The solution is therefore adapted for HF Spaces, but seamless operation there cannot be fully guaranteed. Dear organizers, please address this significant gap in the course material, which challenges users new to HF Spaces. The code is, however, thoroughly tested locally. The Windows local testing script, used to generate the answers, is in local_test_for_windows.py.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
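As an illustration of the file-processing step in the Functionality section, the extension-based dispatch it implies can be sketched as follows. This is a hypothetical sketch: the handler names and the HANDLERS table are illustrative, not the actual internals of GAIAProcessor, which wires pdfplumber/PyPDF2, pytesseract, python-docx/python-pptx, and faster-whisper/shazamio/pydub into its own logic.

```python
# Hypothetical sketch of extension-based file dispatch.
# Handler names are illustrative stand-ins for agent.py internals.

from pathlib import Path

# extension -> name of the (hypothetical) handler
HANDLERS = {
    ".pdf": "extract_pdf_text",    # pdfplumber, PyPDF2
    ".xlsx": "extract_table",      # pandas
    ".csv": "extract_table",
    ".txt": "read_text",
    ".json": "read_json",
    ".jsonl": "read_json",
    ".png": "ocr_image",           # pytesseract
    ".jpg": "ocr_image",
    ".docx": "extract_docx_text",  # python-docx
    ".pptx": "extract_pptx_text",  # python-pptx
    ".mp3": "process_audio",       # faster-whisper, shazamio, pydub
    ".m4a": "process_audio",
    ".xml": "read_text",
}


def pick_handler(file_path: str) -> str:
    """Return the handler name for a file, or raise for unsupported types."""
    ext = Path(file_path).suffix.lower()
    try:
        return HANDLERS[ext]
    except KeyError:
        raise ValueError(f"unsupported file type: {ext}")


print(pick_handler("report.PDF"))  # -> extract_pdf_text
```

Keeping the mapping in one table makes it easy to see at a glance which of the formats listed above are supported and which library handles each.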
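The three-step workflow described above can be sketched as a plain pipeline. The function names mirror the node names from the Workflow section (web_search, analyze_question, create_answer), but the bodies here are stubs; the real agent.py composes these steps in a LangGraph graph and calls ChatOllama, the search tools, and the file parsers. The payload field names in run_task are assumptions, not the exact GAIA API schema.

```python
# Minimal stub sketch of the task_id/question/file_path pipeline.
# Real logic (LangGraph, ChatOllama, scraping) is replaced by stand-ins.

from typing import Optional


def web_search(question: str) -> str:
    """Step 1 (stub): gather web context; the real node queries
    Wikipedia, Arxiv, DuckDuckGo and scrapes with BeautifulSoup."""
    return f"web context for: {question}"


def analyze_question(question: str, file_path: Optional[str]) -> str:
    """Step 2 (stub): combine the question with any attached file's
    content; the real node dispatches on file type."""
    file_note = f"parsed file {file_path}" if file_path else "no file attached"
    return f"{question} | {file_note}"


def create_answer(web_context: str, analysis: str) -> str:
    """Step 3 (stub): produce the final answer; the real node prompts
    qwen2:7b / llama3:8b via ChatOllama and validates the format."""
    return f"answer based on [{analysis}] and [{web_context}]"


def run_task(task_id: str, question: str, file_path: Optional[str] = None) -> dict:
    """End-to-end pipeline returning a payload for submission
    (field names are illustrative)."""
    web_context = web_search(question)
    analysis = analyze_question(question, file_path)
    return {"task_id": task_id,
            "submitted_answer": create_answer(web_context, analysis)}


result = run_task("task-001", "How many studio albums?", file_path=None)
print(result["task_id"])  # -> task-001
```

The same shape, with real node implementations, is what Gradio in app.py drives for each fetched question.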