| title: Deep Research Evaluator | |
| emoji: 🧠 | |
| colorFrom: red | |
| colorTo: purple | |
| sdk: streamlit | |
| sdk_version: 1.41.1 | |
| app_file: app.py | |
| pinned: true | |
| license: mit | |
| short_description: Deep Research Evaluator for Long Horizon Learning Tasks | |
| Deep Research Evaluator is a conceptual AI system designed to analyze and synthesize information from extensive research literature, such as arXiv papers, to learn about specific topics and generate code applicable to long-horizon tasks in AI. This involves understanding complex subjects, identifying relevant methodologies, and implementing solutions that require planning and execution over extended sequences. | |
| # Project Architecture | |
| - **Root Folder** | |
|   - **app.py** (*Streamlit App*) | |
|     - Main entry point for the Streamlit application. | |
|   - **requirements.txt** (*Dependencies*) | |
|     - Lists the Python packages needed to run the app. | |
|   - **mycomponent** (*HTML Component*) | |
|     - A subdirectory containing the custom Streamlit component code. | |
|     - **\_\_init\_\_.py** (*Python Init*) | |
|       - Marks this folder as an importable Python package. | |
|     - **index.html** (*Custom HTML*) | |
|       - Front-end HTML/JS/CSS for the custom component. | |
| ```mermaid | |
| flowchart TB | |
| A[Root Folder] --> B["app.py<br>(Streamlit App)"] | |
| A --> C["requirements.txt<br>(Dependencies)"] | |
| A --> D["mycomponent<br>(HTML Component)"] | |
| D --> E["__init__.py<br>(Python Init)"] | |
| D --> F["index.html<br>(Custom HTML)"] | |
| ``` | |
| --- | |
| **Usage Flow**: | |
| 1. You run `streamlit run app.py`. | |
| 2. **app.py** imports **mycomponent** to load the HTML from **index.html**. | |
| 3. **requirements.txt** lists the dependencies, which must be installed before the app starts (for example with `pip install -r requirements.txt`). | |
| 4. The **\_\_init\_\_.py** file ensures the custom component folder is recognized as a Python package. | |
| **Notes**: | |
| - **app.py** hosts your Streamlit logic and references **mycomponent**. | |
| - **index.html** supplies the interface for any front-end custom elements. | |
| - **requirements.txt** keeps the environment consistent. | |
| Features | |
Core Configuration & Setup
Configures the Streamlit page with the title "TalkingAIResearcher", and sets the layout, sidebar state, and environment variables.
API Setup & Clients
Loads and initializes OpenAI, Anthropic, and HuggingFace clients from environment variables and secrets.
Session State Management
Manages conversation history, transcripts, file editing states, and model selections.
get_high_info_terms()
| Extracts top words/bigrams from a text by counting frequency and filtering out stop words. | |
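A minimal sketch of this kind of term extraction is shown here. The stop-word list, the `top_n` parameter, and the minimum word length are assumptions made for illustration; the version in app.py may differ.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the app's actual list is not shown in this README.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for", "is", "on", "with"}

def get_high_info_terms(text: str, top_n: int = 5) -> list[str]:
    """Return the most frequent non-stop-word unigrams and bigrams."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    # Keep words longer than two characters that are not stop words.
    kept = [w for w in words if w not in STOP_WORDS and len(w) > 2]
    # Bigrams are built from the filtered list, so they can skip over stop words.
    bigrams = [f"{a} {b}" for a, b in zip(kept, kept[1:])]
    counts = Counter(kept + bigrams)
    return [term for term, _ in counts.most_common(top_n)]
```

Counting unigrams and bigrams in one `Counter` keeps the ranking simple, at the cost of letting a frequent bigram and its component words all appear in the result.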
clean_text_for_filename()
| Sanitizes text for valid filenames by removing special characters, short/unhelpful words, and truncating length. | |
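The sanitizing steps described above could look roughly like the following. The `UNHELPFUL` word set, the three-character cutoff, and the `max_len` default are illustrative assumptions, not values taken from app.py.

```python
import re

# Hypothetical set of low-value words to drop; the real list is an assumption.
UNHELPFUL = {"the", "and", "for", "with", "this", "that", "what"}

def clean_text_for_filename(text: str, max_len: int = 40) -> str:
    """Turn free text into a short, filesystem-friendly token string."""
    # Replace anything other than lowercase letters, digits, and whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    # Drop words of three characters or fewer, plus the low-value words above.
    words = [w for w in text.split() if len(w) > 3 and w not in UNHELPFUL]
    # Join with underscores, truncate, and avoid a trailing underscore.
    return "_".join(words)[:max_len].rstrip("_")
```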
generate_filename()
| Creates an intelligent filename based on timestamps, high-info terms, and a snippet of the content (removing duplicates). | |
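As a rough illustration, a filename builder along these lines combines a timestamp prefix, the most frequent terms, and the first few words of the prompt. The timestamp format, the number of terms, the 80-character cap, and the `now` parameter (added so the result is reproducible) are all assumptions.

```python
import re
from collections import Counter
from datetime import datetime

def generate_filename(prompt: str, response: str, ext: str = "md", now=None) -> str:
    """Build a filename from a timestamp, frequent terms, and a prompt snippet."""
    now = now or datetime.now()
    prefix = now.strftime("%m%d_%H%M")
    combined = f"{prompt} {response}".lower()
    # Frequent words longer than three characters stand in for "high-info terms".
    words = [w for w in re.findall(r"[a-z0-9]+", combined) if len(w) > 3]
    top_terms = [w for w, _ in Counter(words).most_common(3)]
    # The snippet is the first four words of the prompt.
    snippet = re.findall(r"[a-z0-9]+", prompt.lower())[:4]
    # dict.fromkeys removes duplicates while preserving first-seen order.
    parts = list(dict.fromkeys(top_terms + snippet))
    stem = f"{prefix}_{'_'.join(parts)}"[:80].rstrip("_")
    return f"{stem}.{ext}"
```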
create_file()
Saves prompt + response content to a file, using generate_filename().
get_download_link()
| Generates base64-encoded download links for .md, audio, or zip files for inline downloading. | |
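One common way to build such a link is a `data:` URI inside an HTML anchor. The sketch below assumes that approach; the `MIME_TYPES` table and the `label` parameter are hypothetical.

```python
import base64
from pathlib import Path

# Assumed mapping from file extension to MIME type for the supported formats.
MIME_TYPES = {
    ".md": "text/markdown",
    ".mp3": "audio/mpeg",
    ".wav": "audio/wav",
    ".zip": "application/zip",
}

def get_download_link(path: str, label: str = "Download") -> str:
    """Return an HTML anchor that embeds the file as a base64 data URI."""
    p = Path(path)
    b64 = base64.b64encode(p.read_bytes()).decode("ascii")
    mime = MIME_TYPES.get(p.suffix.lower(), "application/octet-stream")
    return f'<a href="data:{mime};base64,{b64}" download="{p.name}">{label} {p.name}</a>'
```

In a Streamlit app the returned string would typically be rendered with `st.markdown(link, unsafe_allow_html=True)`. Because the whole file is embedded in the page, this suits small files better than large audio archives.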
clean_for_speech()
| Strips out line breaks, URLs, and symbols to create more readable text for TTS. | |
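A minimal sketch of this cleanup, assuming a small set of markdown symbols to strip (the exact symbol set used by the app is not stated here):

```python
import re

def clean_for_speech(text: str) -> str:
    """Remove URLs, markdown symbols, and line breaks so TTS reads plain sentences."""
    text = re.sub(r"https?://\S+", "", text)       # drop URLs
    text = re.sub(r"[#*_`<>|\[\]()]", " ", text)   # replace common markdown symbols
    text = re.sub(r"\s+", " ", text)               # collapse line breaks and repeated spaces
    return text.strip()
```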
edge_tts_generate_audio()
Asynchronously generates audio files (e.g., .mp3) using Edge TTS.
speak_with_edge_tts()
| A wrapper function for the async TTS call, allowing direct usage in synchronous code. | |
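The wrapper pattern can be sketched with `asyncio.run`. To keep the example runnable without network access, the coroutine below is a stand-in that only returns the output path; in the app it would call the Edge TTS library instead. The signatures are assumptions.

```python
import asyncio

async def edge_tts_generate_audio(text: str, out_path: str) -> str:
    """Stand-in coroutine (hypothetical): real code would synthesize audio here."""
    await asyncio.sleep(0)  # placeholder for the network round trip
    return out_path

def speak_with_edge_tts(text: str, out_path: str = "speech.mp3") -> str:
    """Run the async TTS coroutine to completion from synchronous code."""
    return asyncio.run(edge_tts_generate_audio(text, out_path))
```

`asyncio.run` raises `RuntimeError` when called from a thread that already has a running event loop, so this wrapper fits plain synchronous callers only.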
play_and_download_audio()
Embeds an audio player in Streamlit and provides a download link for that audio file.
save_qa_with_audio()
Stores Q&A content in a markdown file and generates TTS audio for the question + answer.
parse_arxiv_refs()
Parses the multi-line markdown references returned by the ArXiv RAG pipeline into structured paper objects.
create_paper_links_md()
Builds a minimal markdown page with numbered links to each paper's ArXiv URL.
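A sketch of the link page builder, assuming each parsed paper is a dict with `title` and `url` keys (the real paper structure is not specified in this README, and the page heading is made up):

```python
def create_paper_links_md(papers: list[dict]) -> str:
    """Render a numbered markdown list of paper links."""
    lines = ["# Paper Links", ""]
    for i, paper in enumerate(papers, start=1):
        # Assumed keys: 'title' and 'url'.
        lines.append(f"{i}. [{paper['title']}]({paper['url']})")
    return "\n".join(lines)
```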
create_paper_audio_files()
Processes each parsed paper, generating TTS audio and embedding base64 download links.
display_papers()
Shows papers in the main area with a scrolling marquee (via streamlit_marquee), plus expanders for details and audio.
display_papers_in_sidebar()
Mirrors the paper listing in the sidebar with expanders, letting users quickly play or download paper audio.
display_file_history_in_sidebar()
Lists all local .md, .mp3, and .wav files sorted by modification time, newest first, letting users preview and download them.
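The file listing step, without the Streamlit rendering, might look like this. `list_recent_files` is a hypothetical helper name used only for this sketch.

```python
from pathlib import Path

def list_recent_files(folder: str = ".", exts=(".md", ".mp3", ".wav")) -> list[Path]:
    """Return matching files in the folder, newest modification time first."""
    files = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in exts
    ]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)
```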
create_zip_of_files()
| Bundles multiple files (markdown + audio) into a zip with an automatically shortened filename. | |
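A minimal sketch using the standard `zipfile` module. How the app shortens the archive name is not described, so the simple truncation and the `max_stem` parameter below are assumptions.

```python
import zipfile
from pathlib import Path

def create_zip_of_files(paths: list[str], zip_name: str = "bundle.zip", max_stem: int = 20) -> str:
    """Write the given files into a zip whose base name is truncated to max_stem characters."""
    target = Path(zip_name)
    out = str(target.with_name(target.stem[:max_stem] + ".zip"))
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            # Store only the file name so the archive has no directory prefixes.
            zf.write(path, arcname=Path(path).name)
    return out
```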
perform_ai_lookup()
The main lookup function, which:
- Queries Anthropic (Claude)
- Calls an ArXiv RAG pipeline
- Generates Q&A audio
- Parses and renders the resulting papers
process_voice_input()
Receives user text/voice input, then calls perform_ai_lookup() to produce an audio summary and final Q&A file.
main()
Orchestrates the entire application flow:
- Renders tabs for Voice Input, Media Gallery, ArXiv search, and Editor
- Shows file history in the sidebar
- Manages marquee settings and final UI layout