Commit 8899889 (0 parents)
initial commit to origin

Files changed:
- .gitignore +43 -0
- Dockerfile +31 -0
- README.md +87 -0
- app.py +155 -0
- pyproject.toml +46 -0
- research_assistant/__init__.py +5 -0
- research_assistant/tools.py +27 -0
- uv.lock +0 -0
.gitignore
ADDED
@@ -0,0 +1,43 @@
+# Virtual environment
+.venv/
+venv/
+
+# Environment variables
+.env
+
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+
+# IDE
+.idea/
+.vscode/
+*.swp
+*.swo
+
+# Logs
+*.log
+logs/
+chainlit.md
+.chainlit/*
+
+**/.DS_Store
+.DS_Store
Dockerfile
ADDED
@@ -0,0 +1,31 @@
+
+# Get a distribution that has uv already installed
+FROM ghcr.io/astral-sh/uv:python3.13-bookworm-slim
+
+# Add user - this is the user that will run the app
+# If you do not set user, the app will run as root (undesirable)
+RUN useradd -m -u 1000 user
+USER user
+
+# Set the home directory and path
+ENV HOME=/home/user \
+    PATH=/home/user/.local/bin:$PATH
+
+ENV UVICORN_WS_PROTOCOL=websockets
+
+
+# Set the working directory
+WORKDIR $HOME/app
+
+# Copy the app to the container
+COPY --chown=user . $HOME/app
+
+# Install the dependencies
+# RUN uv sync --frozen
+RUN uv sync
+
+# Expose the port
+EXPOSE 7860
+
+# Run the app
+CMD ["uv", "run", "chainlit", "run", "app.py", "--host", "0.0.0.0", "--port", "7860"]
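The base image tag above pins Python 3.13, while `pyproject.toml` in this same commit declares `requires-python = ">=3.9,<3.12"`. Which interpreter `uv sync` ends up using inside the container is not visible from the diff. As a minimal sketch, assuming only the bounds copied from `pyproject.toml`, the following checks whether an interpreter version falls inside the declared range; `python_in_range` is a hypothetical helper, not part of the commit.

```python
import sys
from typing import Tuple

# Bounds copied from requires-python = ">=3.9,<3.12" in pyproject.toml.
MIN_INCLUSIVE = (3, 9)
MAX_EXCLUSIVE = (3, 12)


def python_in_range(version: Tuple[int, int]) -> bool:
    """Return True if (major, minor) satisfies >=3.9,<3.12 (hypothetical helper)."""
    return MIN_INCLUSIVE <= version < MAX_EXCLUSIVE


if __name__ == "__main__":
    current = (sys.version_info.major, sys.version_info.minor)
    print(f"Python {current[0]}.{current[1]} in declared range: {python_in_range(current)}")
```

Running this with the interpreter that actually serves the app is one way to confirm whether the image and the project metadata agree.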
README.md
ADDED
@@ -0,0 +1,87 @@
+# Research Assistant
+
+A powerful research assistant that combines Wikipedia, Reddit, and Semantic Scholar using LangGraph and Chainlit.
+
+## Prerequisites
+
+- Python 3.9 or higher
+- `uv` package manager (install with `curl -LsSf https://astral.sh/uv/install.sh | sh`)
+
+## Setup
+
+1. Clone the repository:
+```bash
+git clone <repository-url>
+cd research-assistant
+```
+
+2. Create and activate virtual environment:
+```bash
+uv venv
+source .venv/bin/activate # On Unix/macOS
+# or
+.venv\Scripts\activate # On Windows
+```
+
+3. Install dependencies:
+```bash
+# Install all dependencies (including dev dependencies)
+uv sync --all
+
+# Or, install only production dependencies
+uv sync
+```
+
+4. Configure your environment:
+```bash
+# Copy the environment template
+cp .env.template .env
+
+# Edit .env with your API keys
+OPENAI_API_KEY=your_openai_api_key
+REDDIT_CLIENT_ID=your_reddit_client_id
+REDDIT_CLIENT_SECRET=your_reddit_client_secret
+```
+
+## Development
+
+The project uses modern Python development tools:
+- `ruff` for linting
+- `black` for code formatting
+- `mypy` for type checking
+
+To run the development tools:
+```bash
+# Format code
+black .
+
+# Lint code
+ruff check .
+
+# Type check
+mypy .
+```
+
+## Running the Application
+
+1. Activate the virtual environment (if not already activated):
+```bash
+source .venv/bin/activate # On Unix/macOS
+# or
+.venv\Scripts\activate # On Windows
+```
+
+2. Start the Chainlit app:
+```bash
+chainlit run app.py
+```
+
+The application will be available at `http://localhost:8000`
+
+## Project Structure
+
+- `app.py`: Main application with LangGraph implementation
+- `tools.py`: Tool implementations (Wikipedia, Reddit, Semantic Scholar)
+- `chainlit.md`: Chainlit welcome message
+- `pyproject.toml`: Project metadata and dependency specifications
+- `.env.template`: Template for environment variables
app.py
ADDED
@@ -0,0 +1,155 @@
+from typing import List, Dict, TypedDict, Union, Annotated
+import chainlit as cl
+from langgraph.graph import StateGraph, END
+from langgraph.graph.message import add_messages
+from langgraph.prebuilt import ToolNode
+from langchain_openai import ChatOpenAI
+from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
+from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
+from langchain_core.tools import BaseTool
+
+from operator import itemgetter
+from pydantic import BaseModel, Field, ConfigDict
+from research_assistant.tools import tools
+
+
+import json
+from dotenv import load_dotenv
+
+load_dotenv()
+
+
+# Types for our nodes
+class AgentState(TypedDict):
+    """State for the research agent."""
+    messages: Annotated[list, add_messages]
+
+
+# Initialize the LLM
+llm = ChatOpenAI(model="gpt-4o-mini", temperature=0, streaming=True)
+
+# bind tools to the llm
+llm = llm.bind_tools(tools)
+
+
+# Agent node implementation
+async def call_model(state: AgentState) -> Dict:
+    """Agent node that decides which tool to use."""
+    print("...........................................Calling agent model...........................................")
+    print(f"State:: {state}\n\n")
+    response = llm.invoke(state["messages"])
+    return {"messages": [response]}
+
+
+execute_tool = ToolNode(tools)
+
+# Create the graph
+uncompiled_graph = StateGraph(AgentState)
+
+# Add nodes
+uncompiled_graph.add_node("agent", call_model)
+uncompiled_graph.add_node("action", execute_tool)
+
+
+# conditional edge function
+def should_continue(state):
+    last_message = state["messages"][-1]
+
+    if last_message.tool_calls:
+        return "action"
+
+    return END
+
+
+# Add edges
+uncompiled_graph.add_conditional_edges("agent", should_continue)
+uncompiled_graph.add_edge("action", "agent")
+
+# Set entry point
+uncompiled_graph.set_entry_point("agent")
+
+# Compile the graph
+compiled_graph = uncompiled_graph.compile()
+
+
+@cl.on_chat_start
+async def start():
+    """Initialize the chat session."""
+    # Initialize session state
+    initial_state = AgentState(
+        messages=[SystemMessage(content="You are a helpful research assistant. Only answer the last question.")],
+    )
+
+    cl.user_session.set("state", initial_state)
+
+    await cl.Message(
+        content="""👋 Hello! I'm your research assistant. I can help you find information from:
+
+- 📚 Wikipedia
+- 💬 Reddit discussions
+- 📖 Academic papers (Semantic Scholar)
+
+What would you like to know about?"""
+    ).send()
+
+
+@cl.on_message
+async def main(message: cl.Message):
+    """Handle incoming messages."""
+    # Get current session state
+    state_dict = cl.user_session.get("state")
+    state = AgentState(**state_dict)
+
+    # Update messages in state
+    state["messages"].append(HumanMessage(content=message.content))
+    inputs = {"messages": state["messages"]}
+    # try:
+    msg = cl.Message(content="")
+    # Run the graph with current state
+    async for chunk in compiled_graph.astream(inputs, stream_mode="updates"):
+        for node, values in chunk.items():
+
+            print(f"-------------- Receiving update from node: '{node}' --------------")
+            await msg.stream_token(f"Receiving update from node: **{node}**\n")
+            if node == "action":
+                for tool_msg in values["messages"]:
+                    output = f"Tool used: {tool_msg.name}"
+                    # output += f"\nTool output: {tool_msg.content}"
+                    print(output)
+                    await msg.stream_token(f"{output}\n\n")
+            else: # node == "agent"
+                if values["messages"][0].tool_calls:
+                    tool_names = [tool["name"] for tool in values["messages"][0].tool_calls]
+                    output = f"Tool(s) Selected: {', '.join(tool_names)}"
+                    print(output)
+                    await msg.stream_token(f"{output}\n\n")
+                else:
+                    # output = f"\n\n\n**Final Model output**: {values['messages'][-1].content}"
+                    output = "\n**Final output**\n"
+                    print(output)
+                    print(values["messages"][-1].content)
+                    await msg.stream_token(f"{output}")
+                    # await msg.stream_token(values["messages"][-1].content)
+            print("\n\n")
+
+            # stream messages to the UI
+            if token := values["messages"][-1].content:
+                await msg.stream_token(token)
+
+            # Update messages in state
+            # state["messages"].extend(values["messages"])
+            # msg = cl.Message(content=values["messages"][-1].content)
+            # await message.send()
+
+
+    # Update session state
+    cl.user_session.set("state", state)
+
+
+    # except Exception as e:
+    # await cl.Message(
+    # content=f"""❌ An error occurred:
+    # ```python
+    # {str(e)}
+    # ```"""
+    # ).send()
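The graph in `app.py` alternates between an `agent` node and an `action` node, and `should_continue` decides after each agent turn whether to run tools or stop. The sketch below replays that control flow without LangGraph or an LLM, under stated assumptions: `FakeAIMessage`, `run_loop`, the scripted replies, and the tool name in the example are all hypothetical stand-ins, and `END` here is a local constant standing in for `langgraph.graph.END`. It only illustrates the routing, not how LangGraph executes it.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

END = "__end__"  # local stand-in for langgraph.graph.END


@dataclass
class FakeAIMessage:
    """Hypothetical stand-in for an AI message; only the fields the router reads."""
    content: str = ""
    tool_calls: List[Dict[str, Any]] = field(default_factory=list)


def should_continue(state: Dict[str, List[FakeAIMessage]]) -> str:
    """Route to the tool node if the last message requested tools, else finish."""
    last_message = state["messages"][-1]
    if last_message.tool_calls:
        return "action"
    return END


def run_loop(replies: List[FakeAIMessage]) -> List[str]:
    """Replay scripted agent replies and record which nodes run, in order."""
    visited: List[str] = []
    state: Dict[str, List[FakeAIMessage]] = {"messages": []}
    for reply in replies:
        visited.append("agent")
        state["messages"].append(reply)
        if should_continue(state) == END:
            break
        visited.append("action")  # the real graph would execute the tools here
    return visited


trace = run_loop([
    FakeAIMessage(tool_calls=[{"name": "wikipedia", "args": {"query": "LangGraph"}}]),
    FakeAIMessage(content="Here is a summary."),
])
print(trace)  # ['agent', 'action', 'agent']
```

The trace shows why the `action -> agent` edge matters: tool results always go back to the agent, and only an agent reply without tool calls ends the run.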
pyproject.toml
ADDED
@@ -0,0 +1,46 @@
+[build-system]
+requires = ["setuptools>=69.0.0", "wheel"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "research-assistant"
+version = "0.1.0"
+description = "A research assistant powered by LangGraph and Chainlit"
+requires-python = ">=3.9,<3.12"
+readme = "README.md"
+license = { text = "MIT" }
+dependencies = [
+    "chainlit~=2.0.4",
+    "langgraph~=0.2.67",
+    "langchain~=0.3.15",
+    "langchain-community~=0.3.16",
+    "langchain-openai~=0.3.2",
+    "wikipedia~=1.4.0",
+    "praw~=7.8.1",
+    "semanticscholar~=0.9.0",
+    "python-dotenv~=1.0.1",
+    "websockets>=14.2",
+]
+
+[project.optional-dependencies]
+dev = [
+    "ruff~=0.3.3",
+    "black~=24.2.0",
+    "mypy~=1.9.0",
+]
+
+[tool.setuptools]
+packages = ["research_assistant"]
+
+[tool.ruff]
+select = ["E", "F", "I", "N", "W", "B"]
+line-length = 100
+
+[tool.black]
+line-length = 100
+target-version = ["py39"]
+
+[tool.mypy]
+python_version = "3.9"
+strict = true
+ignore_missing_imports = true
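Most dependencies above are pinned with the `~=` (compatible release) operator. As a rough sketch of what those pins allow, the hypothetical helper `compatible_release_bounds` below expands a plain numeric `~=X.Y.Z` pin into its equivalent lower and upper bounds. It is an illustration for simple versions only and does not replace a real specifier parser.

```python
from typing import Tuple


def compatible_release_bounds(version: str) -> Tuple[str, str]:
    """Return (inclusive lower, exclusive upper) for a `~=version` specifier.

    Handles plain numeric versions such as "2.0.4" or "2.2"; pre-release and
    post-release suffixes are out of scope for this sketch.
    """
    parts = [int(p) for p in version.split(".")]
    if len(parts) < 2:
        raise ValueError("~= needs at least two version components")
    # Drop the last component, then bump the new last component by one.
    upper = parts[:-1]
    upper[-1] += 1
    return version, ".".join(str(p) for p in upper)


for pin in ("2.0.4", "0.2.67", "1.4.0"):
    lower, upper = compatible_release_bounds(pin)
    print(f"~={pin}  ->  >={lower}, <{upper}")
```

For example, `chainlit~=2.0.4` accepts patch releases in the 2.0 series but not 2.1, whereas `websockets>=14.2` has no upper bound at all.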
research_assistant/__init__.py
ADDED
@@ -0,0 +1,5 @@
+"""Research Assistant package for information retrieval from multiple sources."""
+
+from .tools import tools
+
+__all__ = ["tools"]
research_assistant/tools.py
ADDED
@@ -0,0 +1,27 @@
+from dotenv import load_dotenv
+import os
+
+from langchain_community.tools.reddit_search.tool import RedditSearchRun
+from langchain_community.utilities.reddit_search import RedditSearchAPIWrapper
+from langchain_community.tools.semanticscholar.tool import SemanticScholarQueryRun
+from langchain_community.tools import WikipediaQueryRun
+from langchain_community.utilities import WikipediaAPIWrapper
+
+load_dotenv()
+
+wikipedia_tool = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
+semantic_scholar_tool = SemanticScholarQueryRun()
+reddit_tool = RedditSearchRun(
+    api_wrapper=RedditSearchAPIWrapper(
+        client_id=os.getenv("REDDIT_CLIENT_ID"),
+        client_secret=os.getenv("REDDIT_CLIENT_SECRET"),
+        user_agent=os.getenv("REDDIT_USER_AGENT")
+    )
+)
+
+# Initialize tools
+tools = [
+    wikipedia_tool,
+    reddit_tool,
+    semantic_scholar_tool,
+]
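`tools.py` reads three Reddit settings with `os.getenv`, which returns `None` when a variable is unset, and the README in this commit lists only the first two of them. A minimal sketch of a pre-flight check that reports which variables are missing before the wrapper is built; `missing_reddit_vars` is a hypothetical helper and not part of the commit.

```python
import os
from typing import List, Mapping, Optional

# The three variables tools.py passes to RedditSearchAPIWrapper.
REDDIT_ENV_VARS = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT")


def missing_reddit_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the Reddit variables that are absent or blank (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REDDIT_ENV_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_reddit_vars()
    if missing:
        print("Missing Reddit settings: " + ", ".join(missing))
    else:
        print("All Reddit settings are present.")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.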
uv.lock
ADDED
The diff for this file is too large to render.