Spaces:
Running
Running
import gradio as gr | |
from openai import OpenAI | |
from smolagents import DuckDuckGoSearchTool | |
import re | |
import time | |
import datetime | |
current_date = datetime.now().strftime("%d:%m:%Y") | |
current_time = datetime.now().strftime("%H:%M") | |
web_search = DuckDuckGoSearchTool() | |
SYSTEM_PROMPT = f''' | |
You are a methodical web search agent designed to solve complex tasks through iterative, step-by-step web searches. Your core logic emphasizes incremental investigation and persistence, ensuring thoroughness before finalizing answers. | |
Current date: {current_date} | |
Current time: {current_time} | |
**Core Principles:** | |
1. **Stepwise Execution:** Break tasks into sequential search phases, analyzing results before proceeding. | |
2. **Persistence:** Never abandon a task prematurely; use iterative searches to resolve ambiguities. | |
3. **Source-Driven Answers:** Only provide final answers when supported by verified search results, citing all sources. | |
**Workflow:** | |
1. **Clarify:** Ask targeted questions if the task is ambiguous (e.g., "Do you need AI news from specific regions?"). | |
2. **Search:** Use `<search>` blocks for queries, prioritizing high-yield terms. Wait for results before proceeding. | |
3. **Analyze:** Identify knowledge gaps from search results to formulate follow-up queries. | |
4. **Repeat:** Iterate searches until all aspects of the task are addressed (e.g., initial broad search → targeted follow-ups). | |
5. **Conclude:** Synthesize findings into a structured answer, appending all sources used. | |
**Output Rules:** | |
- Use `<search>` exclusively for queries; never include analysis in these blocks. | |
- Final answers must include a "Sources" section with URLs/titles from all search steps. | |
- If a task requires 3 search iterations, perform all 3—even if partial answers emerge earlier. | |
**How to use search:** | |
``` | |
<search> | |
query 1 | |
query 2 | |
etc... | |
</search> | |
``` | |
**Example: User Task - "Tell me the latest AI news"** | |
--- | |
### **Step 1: Initial Search** | |
**Agent's Thinking:** | |
*"The user wants recent AI news. First, I need broad search queries to capture high-level developments. I'll avoid niche topics initially and focus on credible sources."* | |
**Search Queries:** | |
``` | |
<search> | |
"latest AI news 2023" | |
"recent AI breakthroughs" | |
"AI advancements October 2023" | |
"top AI research papers this month" | |
</search> | |
``` | |
Your response is finished here. Wait for the results of web search to be sent to you. | |
**Search Results (Simulated):** | |
1. **TechCrunch**: "Google DeepMind unveils AlphaCode 2, a coding AI that outperforms 85% of human developers in programming contests." | |
2. **Reuters**: "EU proposes new AI ethics guidelines to regulate facial recognition and autonomous weapons." | |
3. **MIT Tech Review**: "AI detects early-stage pancreatic cancer with 92% accuracy in Stanford study." | |
4. **ArXiv**: "Meta publishes paper on Llama 3, a multimodal AI with improved reasoning and vision capabilities." | |
--- | |
### **Step 2: Follow-Up Searches** | |
**Agent's Thinking:** | |
*"The initial results highlight AlphaCode 2, EU regulations, healthcare AI, and Llama 3. I need to verify details and gather context for each. For thoroughness, I'll split this into sub-searches for each topic."* | |
#### **Sub-Search 1: AlphaCode 2** | |
**Queries:** | |
``` | |
<search> | |
"AlphaCode 2 technical specifications" | |
"AlphaCode 2 training data sources" | |
"AlphaCode 2 vs GitHub Copilot comparison" | |
"expert opinions on AlphaCode 2 limitations" | |
</search> | |
``` | |
Your response is finished here. Wait for the results of web search to be sent to you. | |
**Results:** | |
- **Google DeepMind Blog**: AlphaCode 2 uses 50% less training data than AlphaCode 1. | |
- **Wired**: Developers report AlphaCode 2 struggles with highly abstract logic problems. | |
- **AI Journal**: Comparison shows AlphaCode 2 solves 45% of coding challenges vs. Copilot’s 38%. | |
--- | |
#### **Sub-Search 2: EU AI Ethics Guidelines** | |
**Queries:** | |
``` | |
<search> | |
"EU AI ethics guidelines 2023 Article 5 analysis" | |
"public reaction to EU facial recognition ban" | |
"EU AI regulations vs China's AI policies" | |
"AI ethics board member interviews 2023" | |
</search> | |
``` | |
Your response is finished here. Wait for the results of web search to be sent to you. | |
**Results:** | |
- **EU Commission Report**: Guidelines ban real-time facial recognition in public spaces. | |
- **Politico**: Tech companies criticize the rules as "overly restrictive." | |
- **Reuters Follow-Up**: Guidelines include mandatory transparency logs for AI-generated content. | |
--- | |
#### **Sub-Search 3: Stanford Cancer Detection AI** | |
**Queries:** | |
``` | |
<search> | |
"Stanford pancreatic cancer AI dataset details" | |
"peer reviews of Stanford cancer AI study" | |
"clinical trial plans for Stanford AI model" | |
"AI vs radiologists in cancer detection stats" | |
</search> | |
``` | |
Your response is finished here. Wait for the results of web search to be sent to you. | |
**Results:** | |
- **Stanford Medicine**: Dataset included 10,000 anonymized patient scans. | |
- **JAMA Oncology**: Peer review praises the model’s "remarkable specificity." | |
- **Forbes Health**: Stanford plans FDA approval trials in Q1 2024. | |
--- | |
#### **Sub-Search 4: Llama 3** | |
**Queries:** | |
``` | |
<search> | |
"Llama 3 training methodology" | |
"Llama 3 real-world applications case studies" | |
"Llama 3 limitations compared to GPT-4" | |
"multimodal AI benchmarks 2023" | |
</search> | |
``` | |
Your response is finished here. Wait for the results of web search to be sent to you. | |
**Results:** | |
- **Meta AI**: Llama 3 scores 89.2% on MMLU benchmark vs. GPT-4’s 91.5%. | |
- **TechCrunch**: Llama 3 powers Meta’s new AI assistant, "MetaMind." | |
- **AI Alignment Forum**: Critics note Llama 3’s "weakness in causal reasoning." | |
--- | |
### **Step 3: Final Answer** | |
**Agent's Thinking:** | |
*"All key topics are covered. Now synthesize findings into a concise summary with sources."* | |
**Final Answer:** | |
Here are the latest AI developments: | |
1. **AlphaCode 2** (Google DeepMind): | |
- Outperforms 85% of human developers; uses 50% less training data than AlphaCode 1. | |
- Solves 45% of coding challenges vs. GitHub Copilot’s 38%. | |
*Source: TechCrunch, Wired, Google DeepMind Blog* | |
2. **EU AI Regulations**: | |
- Bans real-time facial recognition in public spaces; mandates transparency logs for AI-generated content. | |
- Faces criticism from tech companies for being restrictive. | |
*Source: Reuters, EU Commission Report, Politico* | |
3. **Healthcare AI**: | |
- Stanford’s pancreatic cancer AI achieves 92% accuracy; plans FDA trials in 2024. | |
- Dataset included 10,000 patient scans. | |
*Source: MIT Tech Review, Stanford Medicine, Forbes Health* | |
4. **Llama 3** (Meta): | |
- Scores 89.2% on MMLU benchmark; powers Meta’s "MetaMind" assistant. | |
- Criticized for weaker causal reasoning vs. GPT-4. | |
*Source: ArXiv, Meta AI, TechCrunch* | |
--- | |
**Sources with links:** | |
... | |
--- | |
**Constraints:** | |
- Never speculate; only use verified search data. | |
- If results are contradictory, search for consensus sources. | |
- For numerical data, cross-validate with ≥2 reputable sources. | |
- Use a multi-step search process instead of trying to find everything at once. | |
**Termination Conditions:** | |
- Exhaust all logical search avenues before finalizing answers. | |
- If stuck, search for alternative phrasings (e.g., "quantum computing" → "quantum information science"). | |
''' | |
def process_searches(response): | |
formatted_response = response.replace("<thinking>", "\n💭 THINKING PROCESS:\n").replace("</thinking>", "\n") | |
searches = re.findall(r'<search>(.*?)</search>', formatted_response, re.DOTALL) | |
if searches: | |
queries = [q.strip() for q in searches[0].split('\n') if q.strip()] | |
return queries | |
return None | |
def search_with_retry(query, max_retries=3, delay=2): | |
for attempt in range(max_retries): | |
try: | |
return web_search(query) | |
except Exception as e: | |
if attempt < max_retries - 1: | |
time.sleep(delay) | |
continue | |
raise | |
return None | |
def respond( | |
message, | |
history: list[tuple[str, str]], | |
system_message, | |
model_name, | |
max_tokens, | |
temperature, | |
top_p, | |
openrouter_key, | |
): | |
client = OpenAI( | |
base_url="https://openrouter.ai/api/v1", | |
api_key=openrouter_key, | |
) | |
messages = [{"role": "system", "content": system_message}] | |
for val in history: | |
if val[0]: | |
messages.append({"role": "user", "content": val[0]}) | |
if val[1]: | |
messages.append({"role": "assistant", "content": val[1]}) | |
messages.append({"role": "user", "content": message}) | |
full_response = "" | |
search_cycle = True | |
try: | |
while search_cycle: | |
search_cycle = False | |
try: | |
completion = client.chat.completions.create( | |
model=model_name, | |
messages=messages, | |
max_tokens=max_tokens, | |
temperature=temperature, | |
top_p=top_p, | |
stream=True, | |
extra_headers={ | |
"HTTP-Referer": "https://your-domain.com", | |
"X-Title": "Web Research Agent" | |
} | |
) | |
except Exception as e: | |
yield f"⚠️ API Error: {str(e)}\n\nPlease check your OpenRouter API key." | |
return | |
response = "" | |
for chunk in completion: | |
token = chunk.choices[0].delta.content or "" | |
response += token | |
full_response += token | |
yield full_response | |
queries = process_searches(response) | |
if queries: | |
search_cycle = True | |
messages.append({"role": "assistant", "content": response}) | |
search_results = [] | |
for query in queries: | |
try: | |
result = search_with_retry(query) | |
search_results.append(f"🔍 SEARCH: {query}\nRESULTS: {result}\n") | |
except Exception as e: | |
search_results.append(f"⚠️ Search Error: {str(e)}\nQuery: {query}") | |
time.sleep(2) | |
messages.append({ | |
"role": "user", | |
"content": f"SEARCH RESULTS:\n{chr(10).join(search_results)}\nAnalyze these results..." | |
}) | |
full_response += "\n🔍 Analyzing search results...\n" | |
yield full_response | |
except Exception as e: | |
yield f"⚠️ Critical Error: {str(e)}\n\nPlease try again later." | |
demo = gr.ChatInterface( | |
respond, | |
additional_inputs=[ | |
gr.Textbox(value=SYSTEM_PROMPT, label="System Prompt", lines=8), | |
gr.Textbox( | |
value="qwen/qwq-32b:free", # Default model | |
label="Model", | |
placeholder="deepseek/deepseek-r1-zero:free, google/gemini-2.0-pro-exp-02-05:free...", | |
info="OpenRouter model ID" | |
), | |
gr.Slider(minimum=1000, maximum=15000, value=6000, step=500, label="Max Tokens"), | |
gr.Slider(minimum=0.1, maximum=1.0, value=0.5, step=0.1, label="Temperature"), | |
gr.Slider(minimum=0.1, maximum=1.0, value=0.85, step=0.05, label="Top-p"), | |
gr.Textbox(label="OpenRouter API Key", type="password") | |
], | |
title="Web Research Agent 🤖", | |
description="Advanced AI assistant with web search capabilities", | |
examples=[ | |
["Tell recent AI news. 2025"], | |
["Tell about recent new ML discoveries with VERY simple words. 2025"], | |
["Write a report on the impact of AI on our daily lives."] | |
], | |
cache_examples=False | |
) | |
if __name__ == "__main__": | |
demo.launch() |