writinwaters committed
Commit 4e0a78d · Parent: ffd3989

Updated RESTful API Reference (#908)

### What problem does this PR solve?

Updates the RESTful API reference (`docs/references/api.md`): documents how to obtain an API key, restructures each endpoint into request/response sections, and touches up related docs (README license-badge color, quickstart wording, and the sidebar position of References).

### Type of change

- [x] Documentation Update

README.md CHANGED
@@ -20,7 +20,7 @@
  <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.6.0-brightgreen"
  alt="docker pull infiniflow/ragflow:v0.6.0"></a>
  <a href="https://github.com/infiniflow/ragflow/blob/main/LICENSE">
- <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?style=flat-square&labelColor=d4eaf7&color=1570EF" alt="license">
+ <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?style=flat-square&labelColor=d4eaf7&color=2e6cc4" alt="license">
  </a>
  </p>
 
README_ja.md CHANGED
@@ -20,7 +20,7 @@
  <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.6.0-brightgreen"
  alt="docker pull infiniflow/ragflow:v0.6.0"></a>
  <a href="https://github.com/infiniflow/ragflow/blob/main/LICENSE">
- <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?style=flat-square&labelColor=d4eaf7&color=1570EF" alt="license">
+ <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?style=flat-square&labelColor=d4eaf7&color=2e6cc4" alt="license">
  </a>
  </p>
 
README_zh.md CHANGED
@@ -20,7 +20,7 @@
  <img src="https://img.shields.io/badge/docker_pull-ragflow:v0.6.0-brightgreen"
  alt="docker pull infiniflow/ragflow:v0.6.0"></a>
  <a href="https://github.com/infiniflow/ragflow/blob/main/LICENSE">
- <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?style=flat-square&labelColor=d4eaf7&color=1570EF" alt="license">
+ <img height="21" src="https://img.shields.io/badge/License-Apache--2.0-ffffff?style=flat-square&labelColor=d4eaf7&color=2e6cc4" alt="license">
  </a>
  </p>
 
docs/quickstart.md CHANGED
@@ -127,7 +127,7 @@ To add and configure an LLM:
 
 ![system model settings](https://github.com/infiniflow/ragflow/assets/93570324/cdcc1da5-4494-44cd-ad5b-1222ed6acc3f)
 
- > Some of the models, such as the image-to-text model **qwen-vl-max**, are subsidiary to a particular LLM. And you may need to update your API key accordingly to use these models.
+ > Some models, such as the image-to-text model **qwen-vl-max**, are subsidiary to a specific LLM. You may need to update your API key to access these models.
 
 ## Create your first knowledge base
 
docs/references/_category_.json CHANGED
@@ -1,6 +1,6 @@
 {
 "label": "References",
- "position": 1,
+ "position": 3,
 "link": {
 "type": "generated-index",
 "description": "RAGFlow References"
docs/references/api.md CHANGED
@@ -5,7 +5,7 @@ slug: /api
 
 # API reference
 
- ![](https://github.com/infiniflow/ragflow/assets/12318111/df0dcc3d-789a-44f7-89f1-7a5f044ab729)
 
 ## Base URL
 ```
@@ -14,25 +14,47 @@ https://demo.ragflow.io/v1/
 
 ## Authorization
 
- All the APIs are authorized with API-Key. Please keep it safe and private. Don't reveal it in any way from the front-end.
- The API-Key should put in the header of request:
 ```buildoutcfg
 Authorization: Bearer {API_KEY}
 ```
 
- ## Start a conversation
 
- This should be called whenever there's new user coming to chat.
- ### Path: /api/new_conversation
- ### Method: GET
- ### Parameter:
 
- | name | type | optional | description|
- |------|-------|----|----|
- | user_id| string | No | It's for identifying user in order to search and calculate statistics.|
 
 ### Response
- ```json
 {
 "data": {
 "create_date": "Fri, 12 Apr 2024 17:26:21 GMT",
@@ -42,7 +64,7 @@ This should be called whenever there's new user coming to chat.
 "id": "b9b2e098f8ae11ee9f45fa163e197198",
 "message": [
 {
- "content": "Hi, I'm your assistant, can I help you?",
 "role": "assistant"
 }
 ],
@@ -50,20 +72,60 @@ This should be called whenever there's new user coming to chat.
 "tokens": 0,
 "update_date": "Fri, 12 Apr 2024 17:26:21 GMT",
 "update_time": 1712913981857,
- "user_id": "kevinhu"
 },
 "retcode": 0,
 "retmsg": "success"
 }
- ```
- > data['id'] in response should be stored and will be used in every round of following conversation.
 
- ## Get history of a conversation
 
- ### Path: /api/conversation/\<id\>
- ### Method: GET
 ### Response
- ```json
 {
 "data": {
 "create_date": "Mon, 01 Apr 2024 09:28:42 GMT",
@@ -92,7 +154,7 @@ This should be called whenever there's new user coming to chat.
 "role": "assistant"
 }
 ],
- "user_id": "user name",
 "reference": [
 {
 "chunks": [
@@ -101,7 +163,7 @@ This should be called whenever there's new user coming to chat.
  "content_ltks": "tabl 1:openagi task-solv perform under differ set for three closed-sourc llm . boldfac denot the highest score under each learn schema . metric gpt-3.5-turbo claude-2 gpt-4 zero few zero few zero few clip score 0.0 0.0 0.0 0.2543 0.0 0.3055 bert score 0.1914 0.3820 0.2111 0.5038 0.2076 0.6307 vit score 0.2437 0.7497 0.4082 0.5416 0.5058 0.6480 overal 0.1450 0.3772 0.2064 0.4332 0.2378 0.5281",
  "content_with_weight": "<table><caption>Table 1: OpenAGI task-solving performances under different settings for three closed-source LLMs. Boldface denotes the highest score under each learning schema.</caption>\n<tr><th rowspan=2 >Metrics</th><th >GPT-3.5-turbo</th><th></th><th >Claude-2</th><th >GPT-4</th></tr>\n<tr><th >Zero</th><th >Few</th><th >Zero Few</th><th >Zero Few</th></tr>\n<tr><td >CLIP Score</td><td >0.0</td><td >0.0</td><td >0.0 0.2543</td><td >0.0 0.3055</td></tr>\n<tr><td >BERT Score</td><td >0.1914</td><td >0.3820</td><td >0.2111 0.5038</td><td >0.2076 0.6307</td></tr>\n<tr><td >ViT Score</td><td >0.2437</td><td >0.7497</td><td >0.4082 0.5416</td><td >0.5058 0.6480</td></tr>\n<tr><td >Overall</td><td >0.1450</td><td >0.3772</td><td >0.2064 0.4332</td><td >0.2378 0.5281</td></tr>\n</table>",
 "doc_id": "c790da40ea8911ee928e0242ac180005",
- "docnm_kwd": "OpenAGI When LLM Meets Domain Experts.pdf",
 "img_id": "afab9fdad6e511eebdb20242ac180006-d0bc7892c3ec4aeac071544fd56730a8",
 "important_kwd": [],
 "kb_id": "afab9fdad6e511eebdb20242ac180006",
@@ -123,7 +185,7 @@ This should be called whenever there's new user coming to chat.
  "content_ltks": "5.5 experiment analysi the main experiment result are tabul in tab . 1 and 2 , showcas the result for closed-sourc and open-sourc llm , respect . the overal perform is calcul a the averag of cllp 8 bert and vit score . here , onli the task descript of the benchmark task are fed into llm(addit inform , such a the input prompt and llm\u2019output , is provid in fig . a.4 and a.5 in supplementari). broadli speak , closed-sourc llm demonstr superior perform on openagi task , with gpt-4 lead the pack under both zero-and few-shot scenario . in the open-sourc categori , llama-2-13b take the lead , consist post top result across variou learn schema--the perform possibl influenc by it larger model size . notabl , open-sourc llm significantli benefit from the tune method , particularli fine-tun and\u2019rltf . these method mark notic enhanc for flan-t5-larg , vicuna-7b , and llama-2-13b when compar with zero-shot and few-shot learn schema . in fact , each of these open-sourc model hit it pinnacl under the rltf approach . conclus , with rltf tune , the perform of llama-2-13b approach that of gpt-3.5 , illustr it potenti .",
  "content_with_weight": "5.5 Experimental Analysis\nThe main experimental results are tabulated in Tab. 1 and 2, showcasing the results for closed-source and open-source LLMs, respectively. The overall performance is calculated as the average of CLlP\n8\nBERT and ViT scores. Here, only the task descriptions of the benchmark tasks are fed into LLMs (additional information, such as the input prompt and LLMs\u2019 outputs, is provided in Fig. A.4 and A.5 in supplementary). Broadly speaking, closed-source LLMs demonstrate superior performance on OpenAGI tasks, with GPT-4 leading the pack under both zero- and few-shot scenarios. In the open-source category, LLaMA-2-13B takes the lead, consistently posting top results across various learning schema--the performance possibly influenced by its larger model size. Notably, open-source LLMs significantly benefit from the tuning methods, particularly Fine-tuning and\u2019 RLTF. These methods mark noticeable enhancements for Flan-T5-Large, Vicuna-7B, and LLaMA-2-13B when compared with zero-shot and few-shot learning schema. In fact, each of these open-source models hits its pinnacle under the RLTF approach. Conclusively, with RLTF tuning, the performance of LLaMA-2-13B approaches that of GPT-3.5, illustrating its potential.",
 "doc_id": "c790da40ea8911ee928e0242ac180005",
- "docnm_kwd": "OpenAGI When LLM Meets Domain Experts.pdf",
 "img_id": "afab9fdad6e511eebdb20242ac180006-7e2345d440383b756670e1b0f43a7007",
 "important_kwd": [],
 "kb_id": "afab9fdad6e511eebdb20242ac180006",
@@ -157,7 +219,7 @@ This should be called whenever there's new user coming to chat.
  "content_ltks": "nvlink bridg support nvidia\u00aenvlink\u00aei a high-spe point-to-point peer transfer connect , where one gpu can transfer data to and receiv data from one other gpu . the nvidia a100 card support nvlink bridg connect with a singl adjac a100 card . each of the three attach bridg span two pcie slot . to function correctli a well a to provid peak bridg bandwidth , bridg connect with an adjac a100 card must incorpor all three nvlink bridg . wherev an adjac pair of a100 card exist in the server , for best bridg perform and balanc bridg topolog , the a100 pair should be bridg . figur 4 illustr correct and incorrect a100 nvlink connect topolog . nvlink topolog\u2013top view figur 4. correct incorrect correct incorrect for system that featur multipl cpu , both a100 card of a bridg card pair should be within the same cpu domain\u2014that is , under the same cpu\u2019s topolog . ensur thi benefit workload applic perform . the onli except is for dual cpu system wherein each cpu ha a singl a100 pcie card under it;in that case , the two a100 pcie card in the system may be bridg togeth . a100 nvlink speed and bandwidth are given in the follow tabl . tabl 5. a100 nvlink speed and bandwidth paramet valu total nvlink bridg support by nvidia a100 3 total nvlink rx and tx lane support 96 data rate per nvidia a100 nvlink lane(each direct)50 gbp total maximum nvlink bandwidth 600 gbyte per second pb-10137-001_v03|8 nvidia a100 40gb pcie gpu acceler",
  "content_with_weight": "NVLink Bridge Support\nNVIDIA\u00aeNVLink\u00aeis a high-speed point-to-point peer transfer connection, where one GPU can transfer data to and receive data from one other GPU. The NVIDIA A100 card supports NVLink bridge connection with a single adjacent A100 card.\nEach of the three attached bridges spans two PCIe slots. To function correctly as well as to provide peak bridge bandwidth, bridge connection with an adjacent A100 card must incorporate all three NVLink bridges. Wherever an adjacent pair of A100 cards exists in the server, for best bridging performance and balanced bridge topology, the A100 pair should be bridged. Figure 4 illustrates correct and incorrect A100 NVLink connection topologies.\nNVLink Topology \u2013Top Views \nFigure 4. \nCORRECT \nINCORRECT \nCORRECT \nINCORRECT \nFor systems that feature multiple CPUs, both A100 cards of a bridged card pair should be within the same CPU domain\u2014that is, under the same CPU\u2019s topology. Ensuring this benefits workload application performance. The only exception is for dual CPU systems wherein each CPU has a single A100 PCIe card under it; in that case, the two A100 PCIe cards in the system may be bridged together.\nA100 NVLink speed and bandwidth are given in the following table.\n<table><caption>Table 5. A100 NVLink Speed and Bandwidth </caption>\n<tr><th >Parameter </th><th >Value </th></tr>\n<tr><td >Total NVLink bridges supported by NVIDIA A100 </td><td >3 </td></tr>\n<tr><td >Total NVLink Rx and Tx lanes supported </td><td >96 </td></tr>\n<tr><td >Data rate per NVIDIA A100 NVLink lane (each direction)</td><td >50 Gbps </td></tr>\n<tr><td >Total maximum NVLink bandwidth</td><td >600 Gbytes per second </td></tr>\n</table>\nPB-10137-001_v03 |8\nNVIDIA A100 40GB PCIe GPU Accelerator",
 "doc_id": "806d1ed0ea9311ee860a0242ac180005",
- "docnm_kwd": "A100-PCIE-Prduct-Brief.pdf",
 "img_id": "afab9fdad6e511eebdb20242ac180006-8c11a1edddb21ad2ae0c43b4a5dcfa62",
 "important_kwd": [],
 "kb_id": "afab9fdad6e511eebdb20242ac180006",
@@ -191,45 +253,53 @@ This should be called whenever there's new user coming to chat.
 "retcode": 0,
 "retmsg": "success"
 }
- ```
 
- - **message**: All the chat history in it.
- - role: user or assistant
- - content: the text content of user or assistant. The citations are in format like: ##0$$. The number in the middle indicate which part in data.reference.chunks it refers to.
 
- - **user_id**: This is set by the caller.
- - **reference**: Every item in it refer to the corresponding message in data.message whose role is assistant.
- - chunks
- - content_with_weight: The content of chunk.
- - docnm_kwd: the document name.
- - img_id: the image id of the chunk. It is an optional field only for PDF/pptx/picture. And accessed by 'GET' /document/get/\<id\>.
- - positions: [page_number, [upleft corner(x, y)], [right bottom(x, y)]], the chunk position, only for PDF.
- - similarity: the hybrid similarity.
- - term_similarity: keyword simimlarity
- - vector_similarity: embedding similarity
- - doc_aggs:
- - doc_id: the document can be accessed by 'GET' /document/get/\<id\>
- - doc_name: the file name
- - count: the chunk number hit in this document.
-
- ## Chat
 
- This will be called to get the answer to users' questions.
 
- ### Path: /api/completion
- ### Method: POST
- ### Parameter:
 
- | name | type | optional | description|
- |------|-------|----|----|
- | conversation_id| string | No | This is from calling /new_conversation.|
- | messages| json | No | The latest question, such as `[{"role": "user", "content": "How are you doing!"}]`|
- | quote | bool | Yes | Default: true |
- | stream | bool | Yes | Default: true |
- | doc_ids | string | Yes | Document IDs which is delimited by comma, like `c790da40ea8911ee928e0242ac180005,c790da40ea8911ee928e0242ac180005`. The retrieved content is limited in these documents. |
 
 ### Response
- ```json
 {
 "data": {
 "answer": "The ViT Score for GPT-4 in the zero-shot scenario is 0.5058, and in the few-shot scenario, it is 0.6480. ##0$$",
@@ -240,7 +310,7 @@ This will be called to get the answer to users' questions.
  "content_ltks": "tabl 1:openagi task-solv perform under differ set for three closed-sourc llm . boldfac denot the highest score under each learn schema . metric gpt-3.5-turbo claude-2 gpt-4 zero few zero few zero few clip score 0.0 0.0 0.0 0.2543 0.0 0.3055 bert score 0.1914 0.3820 0.2111 0.5038 0.2076 0.6307 vit score 0.2437 0.7497 0.4082 0.5416 0.5058 0.6480 overal 0.1450 0.3772 0.2064 0.4332 0.2378 0.5281",
  "content_with_weight": "<table><caption>Table 1: OpenAGI task-solving performances under different settings for three closed-source LLMs. Boldface denotes the highest score under each learning schema.</caption>\n<tr><th rowspan=2 >Metrics</th><th >GPT-3.5-turbo</th><th></th><th >Claude-2</th><th >GPT-4</th></tr>\n<tr><th >Zero</th><th >Few</th><th >Zero Few</th><th >Zero Few</th></tr>\n<tr><td >CLIP Score</td><td >0.0</td><td >0.0</td><td >0.0 0.2543</td><td >0.0 0.3055</td></tr>\n<tr><td >BERT Score</td><td >0.1914</td><td >0.3820</td><td >0.2111 0.5038</td><td >0.2076 0.6307</td></tr>\n<tr><td >ViT Score</td><td >0.2437</td><td >0.7497</td><td >0.4082 0.5416</td><td >0.5058 0.6480</td></tr>\n<tr><td >Overall</td><td >0.1450</td><td >0.3772</td><td >0.2064 0.4332</td><td >0.2378 0.5281</td></tr>\n</table>",
 "doc_id": "c790da40ea8911ee928e0242ac180005",
- "docnm_kwd": "OpenAGI When LLM Meets Domain Experts.pdf",
 "img_id": "afab9fdad6e511eebdb20242ac180006-d0bc7892c3ec4aeac071544fd56730a8",
 "important_kwd": [],
 "kb_id": "afab9fdad6e511eebdb20242ac180006",
@@ -262,7 +332,7 @@ This will be called to get the answer to users' questions.
  "content_ltks": "5.5 experiment analysi the main experiment result are tabul in tab . 1 and 2 , showcas the result for closed-sourc and open-sourc llm , respect . the overal perform is calcul a the averag of cllp 8 bert and vit score . here , onli the task descript of the benchmark task are fed into llm(addit inform , such a the input prompt and llm\u2019output , is provid in fig . a.4 and a.5 in supplementari). broadli speak , closed-sourc llm demonstr superior perform on openagi task , with gpt-4 lead the pack under both zero-and few-shot scenario . in the open-sourc categori , llama-2-13b take the lead , consist post top result across variou learn schema--the perform possibl influenc by it larger model size . notabl , open-sourc llm significantli benefit from the tune method , particularli fine-tun and\u2019rltf . these method mark notic enhanc for flan-t5-larg , vicuna-7b , and llama-2-13b when compar with zero-shot and few-shot learn schema . in fact , each of these open-sourc model hit it pinnacl under the rltf approach . conclus , with rltf tune , the perform of llama-2-13b approach that of gpt-3.5 , illustr it potenti .",
  "content_with_weight": "5.5 Experimental Analysis\nThe main experimental results are tabulated in Tab. 1 and 2, showcasing the results for closed-source and open-source LLMs, respectively. The overall performance is calculated as the average of CLlP\n8\nBERT and ViT scores. Here, only the task descriptions of the benchmark tasks are fed into LLMs (additional information, such as the input prompt and LLMs\u2019 outputs, is provided in Fig. A.4 and A.5 in supplementary). Broadly speaking, closed-source LLMs demonstrate superior performance on OpenAGI tasks, with GPT-4 leading the pack under both zero- and few-shot scenarios. In the open-source category, LLaMA-2-13B takes the lead, consistently posting top results across various learning schema--the performance possibly influenced by its larger model size. Notably, open-source LLMs significantly benefit from the tuning methods, particularly Fine-tuning and\u2019 RLTF. These methods mark noticeable enhancements for Flan-T5-Large, Vicuna-7B, and LLaMA-2-13B when compared with zero-shot and few-shot learning schema. In fact, each of these open-source models hits its pinnacle under the RLTF approach. Conclusively, with RLTF tuning, the performance of LLaMA-2-13B approaches that of GPT-3.5, illustrating its potential.",
 "doc_id": "c790da40ea8911ee928e0242ac180005",
- "docnm_kwd": "OpenAGI When LLM Meets Domain Experts.pdf",
 "img_id": "afab9fdad6e511eebdb20242ac180006-7e2345d440383b756670e1b0f43a7007",
 "important_kwd": [],
 "kb_id": "afab9fdad6e511eebdb20242ac180006",
@@ -289,46 +359,50 @@ This will be called to get the answer to users' questions.
 "retcode": 0,
 "retmsg": "success"
 }
- ```
 
- - **answer**: The replay of the chat bot.
- - **reference**:
- - chunks: Every item in it refer to the corresponding message in answer.
- - content_with_weight: The content of chunk.
- - docnm_kwd: the document name.
- - img_id: the image id of the chunk. It is an optional field only for PDF/pptx/picture. And accessed by 'GET' /document/get/\<id\>.
- - positions: [page_number, [upleft corner(x, y)], [right bottom(x, y)]], the chunk position, only for PDF.
- - similarity: the hybrid similarity.
- - term_similarity: keyword simimlarity
- - vector_similarity: embedding similarity
- - doc_aggs:
- - doc_id: the document can be accessed by 'GET' /document/get/\<id\>
- - doc_name: the file name
- - count: the chunk number hit in this document.
-
 ## Get document content or image
 
- This is usually used when display content of citation.
- ### Path: /api/document/get/\<id\>
- ### Method: GET
 
 ## Upload file
 
- This is usually used when upload a file to.
- ### Path: /api/document/upload/
- ### Method: POST
 
- ### Parameter:
 
- | name | type | optional | description |
- |-----------|--------|----------|---------------------------------------------------------|
- | file | file | No | Upload file. |
- | kb_name | string | No | Choose the upload knowledge base name. |
- | parser_id | string | Yes | Choose the parsing method. |
- | run | string | Yes | Parsing will start automatically when the value is "1". |
 
 ### Response
- ```json
 {
 "data": {
 "chunk_num": 0,
@@ -368,24 +442,34 @@ This is usually used when upload a file to.
 "retmsg": "success"
 }
 
- ```
 
 ## Get document chunks
 
- Get the chunks of the document based on doc_name or doc_id.
- ### Path: /api/list_chunks/
- ### Method: POST
 
- ### Parameter:
 
- | Name | Type | Optional | Description |
- |----------|--------|----------|---------------------------------|
- | `doc_name` | string | Yes | The name of the document in the knowledge base. It must not be empty if `doc_id` is not set.|
- | `doc_id` | string | Yes | The ID of the document in the knowledge base. It must not be empty if `doc_name` is not set.|
 
- ### Response
- ```json
 {
 "data": [
 {
@@ -394,7 +478,7 @@ Get the chunks of the document based on doc_name or doc_id.
 "img_id": "0335167613f011ef91240242ac120006-b46c3524952f82dbe061ce9b123f2211"
 },
 {
- "content": "4.3 ProcessingOverheadof RL-CacheACKNOWLEDGMENTSThis section evaluates how e￿ectively our RL-Cache implemen-tation leverages modern multi-core CPUs and GPUs to keep the per-request neural-net processing overhead low. Figure 14 depictsThis researchwas supported inpart by the Regional Government of Madrid (grant P2018/TCS-4499, EdgeData-CM)andU.S. National Science Foundation (grants CNS-1763617 andCNS-1717179).REFERENCES",
 "doc_name": "RL-Cache.pdf",
 "img_id": "0335167613f011ef91240242ac120006-d4c12c43938eb55d2d8278eea0d7e6d7"
 }
@@ -403,28 +487,38 @@ Get the chunks of the document based on doc_name or doc_id.
 "retmsg": "success"
 }
 
- ```
 
- ## Get document list from knowledge base
 
- Get document list based on the knowledge base name and corresponding parameters.
- ### Path: /api/list_kb_docs/
- ### Method: POST
 
- ### Parameter:
 
- | Name | Type | Optional | Description |
- |-------------|--------|----------|----------------------------------------------------------------------|
- | `kb_name` | string | No | The name of the knowledge base, from which you get the document list. |
- | `page` | int | Yes | The number of pages, default:1. |
- | `page_size` | int | Yes | The number of docs for each page, default:15. |
- | `orderby` | string | Yes | `chunk_num`, `create_time`, or `size`, default:`create_time` |
- | `desc` | bool | Yes | Default:True. |
- | `keywords` | string | Yes | Keyword of the document name. |
 
 
 ### Response
- ```json
 {
 "data": {
 "docs": [
@@ -443,28 +537,39 @@ Get document list based on the knowledge base name and corresponding parameters.
 "retmsg": "success"
 }
 
- ```
 
- ## Delete document
 
- Delete document by document id or document name.
- ### Path: /api/document/rm/
- ### Method: POST
 
- ### Parameter:
 
- | Name | Type | Optional | Description |
 |-------------|--------|----------|----------------------------|
- | `doc_names` | List | Yes | The list of document name. |
- | `doc_ids` | List | Yes | The list of document id. |
 
 
- ### Response
- ```json
 {
 "data": true,
 "retcode": 0,
 "retmsg": "success"
 }
 
- ```
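Read together, the sections above describe a two-step chat flow: create a conversation with `GET /api/new_conversation`, keep the returned `data.id`, then post questions to `POST /api/completion`. A minimal offline sketch that only assembles the request pieces (the base URL, paths, and parameter names come from this reference; no HTTP client is included, so nothing here touches the network, and send every request with an `{"Authorization": "Bearer <API_KEY>"}` header as documented above):

```python
import json
from typing import List, Optional

BASE_URL = "https://demo.ragflow.io/v1"  # base URL from this reference


def new_conversation_url(user_id: str) -> str:
    # GET /api/new_conversation; store data["id"] from the response --
    # it is the conversation_id for every following /api/completion call.
    return f"{BASE_URL}/api/new_conversation?user_id={user_id}"


def completion_payload(conversation_id: str, question: str,
                       quote: bool = True, stream: bool = True,
                       doc_ids: Optional[List[str]] = None) -> str:
    # Body for POST /api/completion; doc_ids becomes the documented
    # comma-delimited string that limits retrieval to those documents.
    body = {
        "conversation_id": conversation_id,
        "messages": [{"role": "user", "content": question}],
        "quote": quote,
        "stream": stream,
    }
    if doc_ids:
        body["doc_ids"] = ",".join(doc_ids)
    return json.dumps(body)
```

Pair these helpers with any HTTP client; the response fields (`retcode`, `data`, `reference`) then look like the examples above.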
 
 # API reference
 
+ RAGFlow offers RESTful APIs for you to integrate its capabilities into third-party applications.
 
 ## Base URL
 ```
 
 
 ## Authorization
 
+ All of RAGFlow's RESTful APIs use an API key for authorization, so keep it safe and do not expose it to the front end.
+ Put your API key in the request header:
+
 ```buildoutcfg
 Authorization: Bearer {API_KEY}
 ```
 
+ To get your API key:
+
+ 1. In RAGFlow, click the **Chat** tab in the middle top of the page.
+ 2. Hover over the corresponding dialogue **>** **Chat Bot API** to show the chatbot API configuration page.
+ 3. Click **Api Key** **>** **Create new key** to create your API key.
+ 4. Copy and keep your API key safe.
+
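The header format above is easy to get wrong by omitting the `Bearer` prefix; a tiny hedged sketch of building it (the key strings here are placeholders, not real keys):

```python
def auth_headers(api_key: str) -> dict:
    # RAGFlow authorizes every RESTful call via the API key sent in the
    # Authorization request header, using the Bearer scheme shown above.
    if not api_key:
        raise ValueError(
            "no API key: create one via Chat > Chat Bot API > Api Key > Create new key"
        )
    return {"Authorization": f"Bearer {api_key}"}
```

For example, `auth_headers("<YOUR_API_KEY>")` yields `{"Authorization": "Bearer <YOUR_API_KEY>"}`, ready to merge into any request's headers.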
+ ## Create conversation
+
+ This method creates (news) a conversation for a specific user.
+
+ ### Request
+
+ #### Request URI
+
+ | Method | Request URI |
+ |----------|-------------------------------------------------------------|
+ | GET | `/api/new_conversation` |
+
+ :::note
+ You are *required* to save the `data.id` value returned in the response data, which is the session ID for all upcoming conversations.
+ :::
 
+ #### Request parameter
 
+ | Name | Type | Required | Description |
+ |----------|--------|----------|-------------------------------------------------------------|
+ | `user_id`| string | Yes | The unique identifier assigned to each user. `user_id` must be less than 32 characters and cannot be empty. The following character sets are supported: <br />- 26 lowercase English letters (a-z)<br />- 26 uppercase English letters (A-Z)<br />- 10 digits (0-9)<br />- "_", "-", "." |
 
 ### Response
+
+ <details>
+ <summary>Response example</summary>
+ <pre><code>
 {
 "data": {
 "create_date": "Fri, 12 Apr 2024 17:26:21 GMT",
 
 "id": "b9b2e098f8ae11ee9f45fa163e197198",
 "message": [
 {
+ "content": "Hi, I'm your assistant, what can I do for you?",
 "role": "assistant"
 }
 ],
 
 "tokens": 0,
 "update_date": "Fri, 12 Apr 2024 17:26:21 GMT",
 "update_time": 1712913981857,
+ "user_id": "<USER_ID_SET_BY_THE_CALLER>"
 },
 "retcode": 0,
 "retmsg": "success"
 }
 
+ </code></pre>
+ </details>
+
+ ## Get conversation history
+
+ This method retrieves the history of a specified conversation session.
+
+ ### Request
+
+ #### Request URI
+
+ | Method | Request URI |
+ |----------|-------------------------------------------------------------|
+ | GET | `/api/conversation/<id>` |
+
+ ### Request parameter
+
+ | Name | Type | Required | Description |
+ |----------|--------|----------|-------------------------------------------------------------|
+ | `id` | string | Yes | The unique identifier assigned to a conversation session. `id` must be less than 32 characters and cannot be empty. The following character sets are supported: <br />- 26 lowercase English letters (a-z)<br />- 26 uppercase English letters (A-Z)<br />- 10 digits (0-9)<br />- "_", "-", "." |
 
 ### Response
+
+ #### Response parameter
+
+ - `message`: All conversations in the specified conversation session.
+ - `role`: `"user"` or `"assistant"`.
+ - `content`: The text content of the user or assistant. The citations are in a format like `##0$$`. The number in the middle, 0 in this case, indicates which part in `data.reference.chunks` it refers to.
+
+ - `user_id`: This is set by the caller.
+ - `reference`: Each reference corresponds to one of the assistant's answers in `data.message`.
+ - `chunks`
+ - `content_with_weight`: Content of the chunk.
+ - `doc_name`: Name of the *hit* document.
+ - `img_id`: The image ID of the chunk. It is an optional field only for PDF, PPTX, and images. Call ['GET' /document/get/<id>](#get-document-content-or-image) to retrieve the image.
+ - `positions`: [page_number, [upleft corner(x, y)], [right bottom(x, y)]], the chunk position, only for PDF.
+ - `similarity`: The hybrid similarity.
+ - `term_similarity`: The keyword similarity.
+ - `vector_similarity`: The embedding similarity.
+ - `doc_aggs`:
+ - `doc_id`: ID of the *hit* document. Call ['GET' /document/get/<id>](#get-document-content-or-image) to retrieve the document.
+ - `doc_name`: Name of the *hit* document.
+ - `count`: The number of *hit* chunks in this document.
+
+ <details>
+ <summary>Response example</summary>
+
+ <pre><code>
  {
 "data": {
 "create_date": "Mon, 01 Apr 2024 09:28:42 GMT",
 
  "role": "assistant"
 }
 ],
+ "user_id": "<USER_ID_SET_BY_THE_CALLER>",
 "reference": [
 {
 "chunks": [
 
  "content_ltks": "tabl 1:openagi task-solv perform under differ set for three closed-sourc llm . boldfac denot the highest score under each learn schema . metric gpt-3.5-turbo claude-2 gpt-4 zero few zero few zero few clip score 0.0 0.0 0.0 0.2543 0.0 0.3055 bert score 0.1914 0.3820 0.2111 0.5038 0.2076 0.6307 vit score 0.2437 0.7497 0.4082 0.5416 0.5058 0.6480 overal 0.1450 0.3772 0.2064 0.4332 0.2378 0.5281",
  "content_with_weight": "<table><caption>Table 1: OpenAGI task-solving performances under different settings for three closed-source LLMs. Boldface denotes the highest score under each learning schema.</caption>\n<tr><th rowspan=2 >Metrics</th><th >GPT-3.5-turbo</th><th></th><th >Claude-2</th><th >GPT-4</th></tr>\n<tr><th >Zero</th><th >Few</th><th >Zero Few</th><th >Zero Few</th></tr>\n<tr><td >CLIP Score</td><td >0.0</td><td >0.0</td><td >0.0 0.2543</td><td >0.0 0.3055</td></tr>\n<tr><td >BERT Score</td><td >0.1914</td><td >0.3820</td><td >0.2111 0.5038</td><td >0.2076 0.6307</td></tr>\n<tr><td >ViT Score</td><td >0.2437</td><td >0.7497</td><td >0.4082 0.5416</td><td >0.5058 0.6480</td></tr>\n<tr><td >Overall</td><td >0.1450</td><td >0.3772</td><td >0.2064 0.4332</td><td >0.2378 0.5281</td></tr>\n</table>",
  "doc_id": "c790da40ea8911ee928e0242ac180005",
+ "doc_name": "OpenAGI When LLM Meets Domain Experts.pdf",
  "img_id": "afab9fdad6e511eebdb20242ac180006-d0bc7892c3ec4aeac071544fd56730a8",
  "important_kwd": [],
  "kb_id": "afab9fdad6e511eebdb20242ac180006",

  "content_ltks": "5.5 experiment analysi the main experiment result are tabul in tab . 1 and 2 , showcas the result for closed-sourc and open-sourc llm , respect . the overal perform is calcul a the averag of cllp 8 bert and vit score . here , onli the task descript of the benchmark task are fed into llm(addit inform , such a the input prompt and llm\u2019output , is provid in fig . a.4 and a.5 in supplementari). broadli speak , closed-sourc llm demonstr superior perform on openagi task , with gpt-4 lead the pack under both zero-and few-shot scenario . in the open-sourc categori , llama-2-13b take the lead , consist post top result across variou learn schema--the perform possibl influenc by it larger model size . notabl , open-sourc llm significantli benefit from the tune method , particularli fine-tun and\u2019rltf . these method mark notic enhanc for flan-t5-larg , vicuna-7b , and llama-2-13b when compar with zero-shot and few-shot learn schema . in fact , each of these open-sourc model hit it pinnacl under the rltf approach . conclus , with rltf tune , the perform of llama-2-13b approach that of gpt-3.5 , illustr it potenti .",
  "content_with_weight": "5.5 Experimental Analysis\nThe main experimental results are tabulated in Tab. 1 and 2, showcasing the results for closed-source and open-source LLMs, respectively. The overall performance is calculated as the average of CLlP\n8\nBERT and ViT scores. Here, only the task descriptions of the benchmark tasks are fed into LLMs (additional information, such as the input prompt and LLMs\u2019 outputs, is provided in Fig. A.4 and A.5 in supplementary). Broadly speaking, closed-source LLMs demonstrate superior performance on OpenAGI tasks, with GPT-4 leading the pack under both zero- and few-shot scenarios. In the open-source category, LLaMA-2-13B takes the lead, consistently posting top results across various learning schema--the performance possibly influenced by its larger model size. Notably, open-source LLMs significantly benefit from the tuning methods, particularly Fine-tuning and\u2019 RLTF. These methods mark noticeable enhancements for Flan-T5-Large, Vicuna-7B, and LLaMA-2-13B when compared with zero-shot and few-shot learning schema. In fact, each of these open-source models hits its pinnacle under the RLTF approach. Conclusively, with RLTF tuning, the performance of LLaMA-2-13B approaches that of GPT-3.5, illustrating its potential.",
  "doc_id": "c790da40ea8911ee928e0242ac180005",
+ "doc_name": "OpenAGI When LLM Meets Domain Experts.pdf",
  "img_id": "afab9fdad6e511eebdb20242ac180006-7e2345d440383b756670e1b0f43a7007",
  "important_kwd": [],
  "kb_id": "afab9fdad6e511eebdb20242ac180006",

  "content_ltks": "nvlink bridg support nvidia\u00aenvlink\u00aei a high-spe point-to-point peer transfer connect , where one gpu can transfer data to and receiv data from one other gpu . the nvidia a100 card support nvlink bridg connect with a singl adjac a100 card . each of the three attach bridg span two pcie slot . to function correctli a well a to provid peak bridg bandwidth , bridg connect with an adjac a100 card must incorpor all three nvlink bridg . wherev an adjac pair of a100 card exist in the server , for best bridg perform and balanc bridg topolog , the a100 pair should be bridg . figur 4 illustr correct and incorrect a100 nvlink connect topolog . nvlink topolog\u2013top view figur 4. correct incorrect correct incorrect for system that featur multipl cpu , both a100 card of a bridg card pair should be within the same cpu domain\u2014that is , under the same cpu\u2019s topolog . ensur thi benefit workload applic perform . the onli except is for dual cpu system wherein each cpu ha a singl a100 pcie card under it;in that case , the two a100 pcie card in the system may be bridg togeth . a100 nvlink speed and bandwidth are given in the follow tabl . tabl 5. a100 nvlink speed and bandwidth paramet valu total nvlink bridg support by nvidia a100 3 total nvlink rx and tx lane support 96 data rate per nvidia a100 nvlink lane(each direct)50 gbp total maximum nvlink bandwidth 600 gbyte per second pb-10137-001_v03|8 nvidia a100 40gb pcie gpu acceler",
  "content_with_weight": "NVLink Bridge Support\nNVIDIA\u00aeNVLink\u00aeis a high-speed point-to-point peer transfer connection, where one GPU can transfer data to and receive data from one other GPU. The NVIDIA A100 card supports NVLink bridge connection with a single adjacent A100 card.\nEach of the three attached bridges spans two PCIe slots. To function correctly as well as to provide peak bridge bandwidth, bridge connection with an adjacent A100 card must incorporate all three NVLink bridges. Wherever an adjacent pair of A100 cards exists in the server, for best bridging performance and balanced bridge topology, the A100 pair should be bridged. Figure 4 illustrates correct and incorrect A100 NVLink connection topologies.\nNVLink Topology \u2013Top Views \nFigure 4. \nCORRECT \nINCORRECT \nCORRECT \nINCORRECT \nFor systems that feature multiple CPUs, both A100 cards of a bridged card pair should be within the same CPU domain\u2014that is, under the same CPU\u2019s topology. Ensuring this benefits workload application performance. The only exception is for dual CPU systems wherein each CPU has a single A100 PCIe card under it; in that case, the two A100 PCIe cards in the system may be bridged together.\nA100 NVLink speed and bandwidth are given in the following table.\n<table><caption>Table 5. A100 NVLink Speed and Bandwidth </caption>\n<tr><th >Parameter </th><th >Value </th></tr>\n<tr><td >Total NVLink bridges supported by NVIDIA A100 </td><td >3 </td></tr>\n<tr><td >Total NVLink Rx and Tx lanes supported </td><td >96 </td></tr>\n<tr><td >Data rate per NVIDIA A100 NVLink lane (each direction)</td><td >50 Gbps </td></tr>\n<tr><td >Total maximum NVLink bandwidth</td><td >600 Gbytes per second </td></tr>\n</table>\nPB-10137-001_v03 |8\nNVIDIA A100 40GB PCIe GPU Accelerator",
  "doc_id": "806d1ed0ea9311ee860a0242ac180005",
+ "doc_name": "A100-PCIE-Prduct-Brief.pdf",
  "img_id": "afab9fdad6e511eebdb20242ac180006-8c11a1edddb21ad2ae0c43b4a5dcfa62",
  "important_kwd": [],
  "kb_id": "afab9fdad6e511eebdb20242ac180006",

  "retcode": 0,
  "retmsg": "success"
  }
+ </code></pre>
+ </details>
+ ## Get answer
+
+ This method retrieves from RAGFlow an answer to the user's latest question.
+
+ ### Request
+
+ #### Request URI
+
+ | Method | Request URI |
+ |----------|-------------------------------------------------------------|
+ | POST | `/api/completion` |
+
+ #### Request parameter
+
+ | Name | Type | Required | Description |
+ |------------------|--------|----------|---------------|
+ | `conversation_id`| string | Yes | The ID of the conversation session. Call ['GET' /new_conversation](#create-conversation) to retrieve the ID.|
+ | `messages` | json | Yes | The latest question in JSON form, such as `[{"role": "user", "content": "How are you doing!"}]`.|
+ | `quote` | bool | No | Whether to quote the retrieved chunks in the answer. Default: true. |
+ | `stream` | bool | No | Whether to stream the answer. Default: true. |
+ | `doc_ids` | string | No | Comma-separated document IDs, such as `c790da40ea8911ee928e0242ac180005,23dsf34ree928e0242ac180005`. The retrieved contents will be confined to these documents. |
+
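As an illustration, the request described above might be assembled as follows. This is a minimal sketch, not part of the API itself: `BASE_URL` and `API_KEY` are placeholders for your own deployment and key, and the `build_completion_request` helper is hypothetical.

```python
import json

BASE_URL = "http://127.0.0.1"  # placeholder: your RAGFlow server
API_KEY = "ragflow-xxxxxx"     # placeholder: your RAGFlow API key

def build_completion_request(conversation_id, question,
                             quote=True, stream=True, doc_ids=None):
    """Assemble a POST /api/completion request as a plain dict."""
    payload = {
        "conversation_id": conversation_id,
        # `messages` carries the latest user question.
        "messages": [{"role": "user", "content": question}],
        "quote": quote,
        "stream": stream,
    }
    if doc_ids:
        # `doc_ids` is a comma-separated string, not a JSON list.
        payload["doc_ids"] = ",".join(doc_ids)
    return {
        "method": "POST",
        "url": f"{BASE_URL}/api/completion",
        "headers": {
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        "body": json.dumps(payload),
    }
```

The resulting dict can be sent with any HTTP client; when `stream` is true, read the response body incrementally.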
  ### Response
+
+ - `answer`: The answer to the user's latest question.
+ - `reference`:
+   - `chunks`: The retrieved chunks that contribute to the answer.
+     - `content_with_weight`: Content of the chunk.
+     - `doc_name`: Name of the *hit* document.
+     - `img_id`: The image ID of the chunk. An optional field available only for PDFs, PPTX files, and images. Call ['GET' /document/get/<id>](#get-document-content-or-image) to retrieve the image.
+     - `positions`: [page_number, [upper left corner(x, y)], [bottom right corner(x, y)]], the position of the chunk; only available for PDFs.
+     - `similarity`: The hybrid similarity.
+     - `term_similarity`: The keyword similarity.
+     - `vector_similarity`: The embedding similarity.
+   - `doc_aggs`:
+     - `doc_id`: ID of the *hit* document. Call ['GET' /document/get/<id>](#get-document-content-or-image) to retrieve the document.
+     - `doc_name`: Name of the *hit* document.
+     - `count`: The number of *hit* chunks in this document.
+
+ <details>
+ <summary>Response example</summary>
+
+ <pre><code>
  {
  "data": {
  "answer": "The ViT Score for GPT-4 in the zero-shot scenario is 0.5058, and in the few-shot scenario, it is 0.6480. ##0$$",

  "content_ltks": "tabl 1:openagi task-solv perform under differ set for three closed-sourc llm . boldfac denot the highest score under each learn schema . metric gpt-3.5-turbo claude-2 gpt-4 zero few zero few zero few clip score 0.0 0.0 0.0 0.2543 0.0 0.3055 bert score 0.1914 0.3820 0.2111 0.5038 0.2076 0.6307 vit score 0.2437 0.7497 0.4082 0.5416 0.5058 0.6480 overal 0.1450 0.3772 0.2064 0.4332 0.2378 0.5281",
  "content_with_weight": "<table><caption>Table 1: OpenAGI task-solving performances under different settings for three closed-source LLMs. Boldface denotes the highest score under each learning schema.</caption>\n<tr><th rowspan=2 >Metrics</th><th >GPT-3.5-turbo</th><th></th><th >Claude-2</th><th >GPT-4</th></tr>\n<tr><th >Zero</th><th >Few</th><th >Zero Few</th><th >Zero Few</th></tr>\n<tr><td >CLIP Score</td><td >0.0</td><td >0.0</td><td >0.0 0.2543</td><td >0.0 0.3055</td></tr>\n<tr><td >BERT Score</td><td >0.1914</td><td >0.3820</td><td >0.2111 0.5038</td><td >0.2076 0.6307</td></tr>\n<tr><td >ViT Score</td><td >0.2437</td><td >0.7497</td><td >0.4082 0.5416</td><td >0.5058 0.6480</td></tr>\n<tr><td >Overall</td><td >0.1450</td><td >0.3772</td><td >0.2064 0.4332</td><td >0.2378 0.5281</td></tr>\n</table>",
  "doc_id": "c790da40ea8911ee928e0242ac180005",
+ "doc_name": "OpenAGI When LLM Meets Domain Experts.pdf",
  "img_id": "afab9fdad6e511eebdb20242ac180006-d0bc7892c3ec4aeac071544fd56730a8",
  "important_kwd": [],
  "kb_id": "afab9fdad6e511eebdb20242ac180006",

  "content_ltks": "5.5 experiment analysi the main experiment result are tabul in tab . 1 and 2 , showcas the result for closed-sourc and open-sourc llm , respect . the overal perform is calcul a the averag of cllp 8 bert and vit score . here , onli the task descript of the benchmark task are fed into llm(addit inform , such a the input prompt and llm\u2019output , is provid in fig . a.4 and a.5 in supplementari). broadli speak , closed-sourc llm demonstr superior perform on openagi task , with gpt-4 lead the pack under both zero-and few-shot scenario . in the open-sourc categori , llama-2-13b take the lead , consist post top result across variou learn schema--the perform possibl influenc by it larger model size . notabl , open-sourc llm significantli benefit from the tune method , particularli fine-tun and\u2019rltf . these method mark notic enhanc for flan-t5-larg , vicuna-7b , and llama-2-13b when compar with zero-shot and few-shot learn schema . in fact , each of these open-sourc model hit it pinnacl under the rltf approach . conclus , with rltf tune , the perform of llama-2-13b approach that of gpt-3.5 , illustr it potenti .",
  "content_with_weight": "5.5 Experimental Analysis\nThe main experimental results are tabulated in Tab. 1 and 2, showcasing the results for closed-source and open-source LLMs, respectively. The overall performance is calculated as the average of CLlP\n8\nBERT and ViT scores. Here, only the task descriptions of the benchmark tasks are fed into LLMs (additional information, such as the input prompt and LLMs\u2019 outputs, is provided in Fig. A.4 and A.5 in supplementary). Broadly speaking, closed-source LLMs demonstrate superior performance on OpenAGI tasks, with GPT-4 leading the pack under both zero- and few-shot scenarios. In the open-source category, LLaMA-2-13B takes the lead, consistently posting top results across various learning schema--the performance possibly influenced by its larger model size. Notably, open-source LLMs significantly benefit from the tuning methods, particularly Fine-tuning and\u2019 RLTF. These methods mark noticeable enhancements for Flan-T5-Large, Vicuna-7B, and LLaMA-2-13B when compared with zero-shot and few-shot learning schema. In fact, each of these open-source models hits its pinnacle under the RLTF approach. Conclusively, with RLTF tuning, the performance of LLaMA-2-13B approaches that of GPT-3.5, illustrating its potential.",
  "doc_id": "c790da40ea8911ee928e0242ac180005",
+ "doc_name": "OpenAGI When LLM Meets Domain Experts.pdf",
  "img_id": "afab9fdad6e511eebdb20242ac180006-7e2345d440383b756670e1b0f43a7007",
  "important_kwd": [],
  "kb_id": "afab9fdad6e511eebdb20242ac180006",

  "retcode": 0,
  "retmsg": "success"
  }
+ </code></pre>
+ </details>
  ## Get document content or image
+
+ This method retrieves the content of a document, or a specific image in it. Use this method if you intend to display the content of a citation.
+
+ ### Request
+
+ #### Request URI
+
+ | Method | Request URI |
+ |----------|-------------------------------------------------------------|
+ | GET | `/api/document/get/<id>` |
+
+ ### Response
+
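For instance, given a `doc_id` or `img_id` returned in a `reference` object, the download URL can be formed like this. A sketch only: the base URL is a placeholder and the helper name is made up.

```python
BASE_URL = "http://127.0.0.1"  # placeholder: your RAGFlow server

def document_get_url(item_id: str) -> str:
    """Build the GET /api/document/get/<id> URL for a doc_id or img_id."""
    return f"{BASE_URL}/api/document/get/{item_id}"
```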
  ## Upload file
+
+ This method uploads a file to a specified knowledge base.
+
+ ### Request
+
+ #### Request URI
+
+ | Method | Request URI |
+ |----------|-------------------------------------------------------------|
+ | POST | `/api/document/upload` |
+
+ #### Request parameter
+
+ | Name | Type | Required | Description |
+ |-------------|--------|----------|---------------------------------------------------------|
+ | `file` | file | Yes | The file to upload. |
+ | `kb_name` | string | Yes | The name of the knowledge base to upload the file to. |
+ | `parser_id` | string | No | The parsing method (chunk template) to use. <br />- "naive": General;<br />- "qa": Q&A;<br />- "manual": Manual;<br />- "table": Table;<br />- "paper": Paper;<br />- "laws": Laws;<br />- "presentation": Presentation;<br />- "picture": Picture;<br />- "one": One. |
+ | `run` | string | No | `1`: Automatically start parsing the file once it is uploaded. If `parser_id` is not set, RAGFlow uses the general template by default. |
+
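The form fields for an upload might be assembled as below. This is a sketch: the `build_upload_form` helper is hypothetical, and a real client would send `file` as a multipart/form-data part (for example with `curl -F`).

```python
def build_upload_form(file_path, kb_name, parser_id=None, run_parsing=False):
    """Assemble the form fields for POST /api/document/upload."""
    fields = {"kb_name": kb_name}
    if parser_id is not None:
        # The chunk templates listed in the parameter table above.
        allowed = {"naive", "qa", "manual", "table", "paper",
                   "laws", "presentation", "picture", "one"}
        if parser_id not in allowed:
            raise ValueError(f"unknown parser_id: {parser_id}")
        fields["parser_id"] = parser_id
    if run_parsing:
        fields["run"] = "1"  # start parsing as soon as the upload finishes
    # The file itself is sent as the `file` part of the multipart body.
    fields["file"] = file_path
    return fields
```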
  ### Response
+
+ <details>
+ <summary>Response example</summary>
+ <pre><code>
  {
  "data": {
  "chunk_num": 0,

  "retmsg": "success"
  }

+ </code></pre>
+ </details>
 
  ## Get document chunks

+ This method retrieves the chunks of a specific document by `doc_name` or `doc_id`.
+
+ ### Request
+
+ #### Request URI
+
+ | Method | Request URI |
+ |----------|-------------------------------------------------------------|
+ | GET | `/api/list_chunks` |
+
+ #### Request parameter
+
+ | Name | Type | Required | Description |
+ |------------|--------|----------|---------------------------------------------------------------------------------------------|
+ | `doc_name` | string | No | The name of the document in the knowledge base. It must not be empty if `doc_id` is not set.|
+ | `doc_id` | string | No | The ID of the document in the knowledge base. It must not be empty if `doc_name` is not set.|
+
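The mutual requirement between `doc_name` and `doc_id` can be captured in a small query builder. A sketch under stated assumptions: the base URL is a placeholder and the `list_chunks_url` helper is illustrative, not part of the API.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1"  # placeholder: your RAGFlow server

def list_chunks_url(doc_name=None, doc_id=None):
    """Build the GET /api/list_chunks URL; one of the two parameters is required."""
    if not doc_name and not doc_id:
        raise ValueError("either doc_name or doc_id must be set")
    params = {}
    if doc_name:
        params["doc_name"] = doc_name
    if doc_id:
        params["doc_id"] = doc_id
    return f"{BASE_URL}/api/list_chunks?{urlencode(params)}"
```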
+ ### Response
+
+ <details>
+ <summary>Response example</summary>
+ <pre><code>
  {
  "data": [
  {

  "img_id": "0335167613f011ef91240242ac120006-b46c3524952f82dbe061ce9b123f2211"
  },
  {
+ "content": "4.3 ProcessingOverheadof RL-CacheACKNOWLEDGMENTSThis section evaluates how effectively our RL-Cache implemen-tation leverages modern multi-core CPUs and GPUs to keep the per-request neural-net processing overhead low. Figure 14 depictsThis researchwas supported inpart by the Regional Government of Madrid (grant P2018/TCS-4499, EdgeData-CM)andU.S. National Science Foundation (grants CNS-1763617 andCNS-1717179).REFERENCES",
  "doc_name": "RL-Cache.pdf",
  "img_id": "0335167613f011ef91240242ac120006-d4c12c43938eb55d2d8278eea0d7e6d7"
  }

  "retmsg": "success"
  }

+ </code></pre>
+ </details>
 
+ ## Get document list
+
+ This method retrieves a list of documents from a specified knowledge base.
+
+ ### Request
+
+ #### Request URI
+
+ | Method | Request URI |
+ |----------|-------------------------------------------------------------|
+ | POST | `/api/list_kb_docs` |
+
+ #### Request parameter
+
+ | Name | Type | Required | Description |
+ |-------------|--------|----------|-----------------------------------------------------------------------|
+ | `kb_name` | string | Yes | The name of the knowledge base to retrieve the document list from. |
+ | `page` | int | No | The page number to retrieve. Default: 1. |
+ | `page_size` | int | No | The number of documents per page. Default: 15. |
+ | `orderby` | string | No | The field to sort by: `chunk_num`, `create_time`, or `size`. Default: `create_time`. |
+ | `desc` | bool | No | Whether to sort in descending order. Default: true. |
+ | `keywords` | string | No | Keywords used to match document names. |
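The paging parameters above might be filled in like this. A sketch only: the `build_list_docs_body` helper is hypothetical, and the defaults simply mirror the parameter table.

```python
import json

def build_list_docs_body(kb_name, page=1, page_size=15,
                         orderby="create_time", desc=True, keywords=None):
    """Assemble the POST /api/list_kb_docs body with the documented defaults."""
    if orderby not in ("chunk_num", "create_time", "size"):
        raise ValueError(f"unsupported orderby: {orderby}")
    body = {
        "kb_name": kb_name,
        "page": page,
        "page_size": page_size,
        "orderby": orderby,
        "desc": desc,
    }
    if keywords:
        body["keywords"] = keywords
    return json.dumps(body)
```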
 

  ### Response
+
+ <details>
+ <summary>Response example</summary>
+ <pre><code>
  {
  "data": {
  "docs": [

  "retmsg": "success"
  }

+ </code></pre>
+ </details>
+
+
543
+ ## Delete documents
544
+
545
+ This method deletes documents by document ID or name.
546
+
547
+ ### Request
548
 
549
+ #### Request URI
550
 
551
+ | Method | Request URI |
552
+ |----------|-------------------------------------------------------------|
553
+ | DELETE | `/api/document` |
554
 
555
+ #### Request parameter
556
 
557
+ | Name | Type | Required | Description |
558
  |-------------|--------|----------|----------------------------|
559
+ | `doc_names` | List | No | A list of document names. It must not be empty if `doc_ids` is not set. |
560
+ | `doc_ids` | List | No | A list of document IDs. It must not be empty if `doc_names` is not set. |
561
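The at-least-one-of rule for `doc_names` and `doc_ids` can be enforced when building the request body, as in this sketch (the `build_delete_body` helper is illustrative, not part of the API):

```python
import json

def build_delete_body(doc_names=None, doc_ids=None):
    """Assemble the DELETE /api/document body; at least one list is required."""
    if not doc_names and not doc_ids:
        raise ValueError("either doc_names or doc_ids must be set")
    body = {}
    if doc_names:
        body["doc_names"] = list(doc_names)
    if doc_ids:
        body["doc_ids"] = list(doc_ids)
    return json.dumps(body)
```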
 

+ ### Response
+
+ <details>
+ <summary>Response example</summary>
+ <pre><code>
  {
  "data": true,
  "retcode": 0,
  "retmsg": "success"
  }

+ </code></pre>
+ </details>