{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "url = \"http://export.arxiv.org/api/query?search_query=all:deep+learning&start=0&max_results=5\"\n",
    "response = requests.get(url)"
   ]
  },
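  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional sanity check (a small addition, not part of the original flow):\n",
    "# confirm the arXiv request succeeded and peek at the raw Atom feed before parsing.\n",
    "print(response.status_code)   # expect 200\n",
    "print(response.text[:300])    # first few hundred characters of the XML payload"
   ]
  },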
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import xml.etree.ElementTree as ET\n",
    "\n",
    "# Parse the XML response\n",
    "root = ET.fromstring(response.text)\n",
    "\n",
    "# Extract titles and abstracts\n",
    "abstracts = []  # List to store abstracts\n",
    "titles = []  # List to store titles\n",
    "\n",
    "for entry in root.findall(\"{http://www.w3.org/2005/Atom}entry\"):\n",
    "    title = entry.find(\"{http://www.w3.org/2005/Atom}title\").text.strip()\n",
    "    abstract = entry.find(\"{http://www.w3.org/2005/Atom}summary\").text.strip()\n",
    "\n",
    "    titles.append(title)\n",
    "    abstracts.append(abstract)"
   ]
  },
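  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional check (added for illustration): list the parsed titles so it is\n",
    "# easy to verify that the five entries came through as expected.\n",
    "for i, title in enumerate(titles, start=1):\n",
    "    print(f\"{i}. {title}\")"
   ]
  },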
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/nitin/Documents/AI Agent project/.venv/lib/python3.12/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
      "  from .autonotebook import tqdm as notebook_tqdm\n",
      "Device set to use mps:0\n"
     ]
    }
   ],
   "source": [
    "from transformers import pipeline\n",
    "# Initialize summarization pipeline\n",
    "summarizer = pipeline(\"summarization\", model=\"facebook/bart-large-cnn\")"
   ]
  },
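  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Quick smoke test of the pipeline on a single abstract before the batch run.\n",
    "# This cell is an illustrative addition; the length limits are arbitrary choices.\n",
    "sample = summarizer(abstracts[0], max_length=50, min_length=25, truncation=True)\n",
    "print(sample[0][\"summary_text\"])"
   ]
  },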
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Title: Opening the black box of deep learning\n",
      "Summary: The great success of deep learning shows that its technology contains aprofound truth. Understanding its internal mechanism not only has important implications for the development of its technology and effective application in various fields. At present, most of the theoretical research on\n",
      "--------------------------------------------------------------------------------\n",
      "Title: Concept-Oriented Deep Learning\n",
      "Summary: Concepts are the foundation of human deep learning, understanding, and knowledge integration and transfer. We propose concept-oriented deep learning(CODL) which extends (machine) deep learning with concept representations. CODL addresses some\n",
      "--------------------------------------------------------------------------------\n",
      "Title: Deep learning research landscape & roadmap in a nutshell: past, present\n",
      "  and future -- Towards deep cortical learning\n",
      "Summary: The past, present and future of deep learning is presented in this work. We predict that deep cortical learning will be the convergence of deeplearning & cortical learning which builds an artificial cortical column.\n",
      "--------------------------------------------------------------------------------\n",
      "Title: A First Look at Deep Learning Apps on Smartphones\n",
      "Summary: First empirical study on 16,500 popular Android apps. Demystifies how smartphone apps exploit deep learning in the wild. Findings paint promising picture of deep learning for smartphones.\n",
      "--------------------------------------------------------------------------------\n",
      "Title: Geometrization of deep networks for the interpretability of deep\n",
      "  learning systems\n",
      "Summary: Geometrization is a bridge to connect physics, geometry, deep networkand quantum computation. This may result in a new scheme to reveal the rule of the physical world. It may also help to solve theinterpretability problem of deep\n",
      "--------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "# Summarize abstracts\n",
    "summaries = [summarizer(abs, max_length=50, min_length=25, truncation=True)[0][\"summary_text\"] for abs in abstracts]\n",
    "\n",
    "# Print results\n",
    "for title, summary in zip(titles, summaries):\n",
    "    print(f\"Title: {title}\\nSummary: {summary}\\n\" + \"-\"*80)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/var/folders/cv/01jc2kcx1455j48m7hhwcvqw0000gn/T/ipykernel_17287/3038899353.py:4: LangChainDeprecationWarning: The class `HuggingFaceEmbeddings` was deprecated in LangChain 0.2.2 and will be removed in 1.0. An updated version of the class exists in the :class:`~langchain-huggingface package and should be used instead. To use it run `pip install -U :class:`~langchain-huggingface` and import as `from :class:`~langchain_huggingface import HuggingFaceEmbeddings``.\n",
      "  embedding_model = HuggingFaceEmbeddings(model_name=\"sentence-transformers/all-MiniLM-L6-v2\")\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Top 1 Result:\n",
      "We are in the dawn of deep learning explosion for smartphones. To bridge the\n",
      "gap between research and practice, we present the first empirical study on\n",
      "16,500 the most popular Android apps, demystifying how smartphone apps exploit\n",
      "deep learning in the wild. To this end, we build a new static tool that\n",
      "dissects apps and analyzes their deep learning functions. Our study answers\n",
      "threefold questions: what are the early adopter apps of deep learning, what do\n",
      "they use deep learning for, and how do their deep learning models look like.\n",
      "Our study has strong implications for app developers, smartphone vendors, and\n",
      "deep learning R\\&D. On one hand, our findings paint a promising picture of deep\n",
      "learning for smartphones, showing the prosperity of mobile deep learning\n",
      "frameworks as well as the prosperity of apps building their cores atop deep\n",
      "learning. On the other hand, our findings urge optimizations on deep learning\n",
      "models deployed on smartphones, the protection of these models, and validation\n",
      "of research ideas on these models.\n",
      "--------------------------------------------------------------------------------\n",
      "Top 2 Result:\n",
      "The past, present and future of deep learning is presented in this work.\n",
      "Given this landscape & roadmap, we predict that deep cortical learning will be\n",
      "the convergence of deep learning & cortical learning which builds an artificial\n",
      "cortical column ultimately.\n",
      "--------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "from langchain.vectorstores import FAISS\n",
    "from langchain.embeddings import HuggingFaceEmbeddings\n",
    "# Initialize HuggingFace Embeddings\n",
    "embedding_model = HuggingFaceEmbeddings(model_name=\"sentence-transformers/all-MiniLM-L6-v2\")\n",
    "\n",
    "# Create FAISS Vector Store\n",
    "vector_store = FAISS.from_texts(abstracts, embedding_model)\n",
    "\n",
    "# Perform a similarity search (RAG Retrieval Step)\n",
    "query = \"best deep learning advancements\"\n",
    "top_docs = vector_store.similarity_search(query, k=2)\n",
    "\n",
    "# Display Retrieved Documents\n",
    "for i, doc in enumerate(top_docs):\n",
    "    print(f\"Top {i+1} Result:\\n{doc.page_content}\\n\" + \"-\"*80)"
   ]
  },
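  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional addition: repeat the retrieval with scores attached. LangChain's FAISS\n",
    "# wrapper exposes similarity_search_with_score, which returns (document, distance)\n",
    "# pairs; lower distances mean closer matches to the query.\n",
    "for doc, score in vector_store.similarity_search_with_score(query, k=2):\n",
    "    print(f\"score={score:.4f}  {doc.page_content[:80]}...\")"
   ]
  },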
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Device set to use mps:0\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Agent: Time to flex my research muscles!\n",
      "Top papers: ['First empirical study on 16,500 popular Android apps. Demystifies how smartphone apps exploit deep learning in the wild. Findings', 'The past, present and future of deep learning is presented in this work. We predict that deep cortical learning will be the convergence of deep']\n",
      "Best pick: First empirical study on 16,500 popular Android apps. Demystifies how smartphone apps exploit deep learning in the wild. Findings\n"
     ]
    }
   ],
   "source": [
    "# Prerequisites: pip install langchain faiss-cpu transformers requests\n",
    "import requests\n",
    "from langchain.vectorstores import FAISS\n",
    "from langchain.embeddings import HuggingFaceEmbeddings\n",
    "from transformers import pipeline\n",
    "\n",
    "# Step 1: Define the Agent class (our Research Buddy)\n",
    "class ResearchBuddyAgent:\n",
    "    def __init__(self):\n",
    "        # Load embeddings for RAG\n",
    "        self.embeddings = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
    "        # Load summarizer (small model for speed)\n",
    "        self.summarizer = pipeline(\"summarization\", model=\"facebook/bart-large-cnn\")\n",
    "        self.papers = []  # Store fetched papers\n",
    "\n",
    "    # Step 2: Fetch papers (simple arXiv API call)\n",
    "    def fetch_papers(self, topic):\n",
    "        url = f\"http://export.arxiv.org/api/query?search_query=all:{topic}&start=0&max_results=5\"\n",
    "        response = requests.get(url)\n",
    "        # Parse the XML response\n",
    "        root = ET.fromstring(response.text)\n",
    "\n",
    "        # Extract titles and abstracts\n",
    "        abstracts = []  # List to store abstracts\n",
    "        titles = []  # List to store titles\n",
    "\n",
    "        for entry in root.findall(\"{http://www.w3.org/2005/Atom}entry\"):\n",
    "            title = entry.find(\"{http://www.w3.org/2005/Atom}title\").text.strip()\n",
    "            abstract = entry.find(\"{http://www.w3.org/2005/Atom}summary\").text.strip()\n",
    "\n",
    "            titles.append(title)\n",
    "            abstracts.append(abstract)\n",
    "\n",
    "    # Step 3: Build RAG vector store and retrieve\n",
    "    def retrieve_relevant(self, query):\n",
    "        # Create FAISS vector store from papers\n",
    "        vector_store = FAISS.from_texts(abstracts, embedding_model)\n",
    "        # Search for top matches\n",
    "        top_docs = vector_store.similarity_search(query, k=2)\n",
    "        return [doc.page_content for doc in top_docs]  # Extract text\n",
    "\n",
    "    # Step 4: Summarize papers\n",
    "    def summarize_papers(self, papers):\n",
    "        summaries = []\n",
    "        for paper in papers:\n",
    "            summary = self.summarizer(paper, max_length=30, min_length=15)[0][\"summary_text\"]\n",
    "            summaries.append(summary)\n",
    "        return summaries\n",
    "\n",
    "    # Step 5: Decide the \"best\" (simple rule: shortest summary wins)\n",
    "    def pick_best(self, summaries):\n",
    "        best = min(summaries, key=len)  # Lazy rule for demo\n",
    "        return best\n",
    "\n",
    "    # Step 6: Main agent flow (plan and execute)\n",
    "    def run(self, topic, query):\n",
    "        print(\"Agent: Time to flex my research muscles!\")\n",
    "        # Plan: Fetch β†’ Retrieve β†’ Summarize β†’ Pick\n",
    "        self.fetch_papers(topic)\n",
    "        relevant_papers = self.retrieve_relevant(query)\n",
    "        summaries = self.summarize_papers(relevant_papers)\n",
    "        best_summary = self.pick_best(summaries)\n",
    "        print(f\"Top papers: {summaries}\")\n",
    "        print(f\"Best pick: {best_summary}\")\n",
    "\n",
    "# Run the agent\n",
    "if __name__ == \"__main__\":\n",
    "    agent = ResearchBuddyAgent()\n",
    "    agent.run(topic=\"deep learning\", query=\"best deep learning advancements\")"
   ]
  },
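  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Usage sketch (illustrative addition): the same agent can be pointed at any\n",
    "# other arXiv topic; the topic and query strings below are placeholders.\n",
    "agent.run(topic=\"reinforcement learning\", query=\"recent reinforcement learning advances\")"
   ]
  },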
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}