writinwaters
committed on
Commit
·
5ec7450
1
Parent(s):
c31ab66
DRAFT: Updated python and http api references (#2973)
### What problem does this PR solve?
### Type of change
- [x] Documentation Update
- api/http_api_reference.md +57 -45
- api/python_api_reference.md +46 -44
api/http_api_reference.md
CHANGED
@@ -20,7 +20,7 @@ Creates a dataset.
|
|
20 |
### Request
|
21 |
|
22 |
- Method: POST
|
23 |
-
- URL:
|
24 |
- Headers:
|
25 |
- `'content-Type: application/json'`
|
26 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -163,7 +163,7 @@ Deletes datasets by ID.
|
|
163 |
### Request
|
164 |
|
165 |
- Method: DELETE
|
166 |
-
- URL:
|
167 |
- Headers:
|
168 |
- `'content-Type: application/json'`
|
169 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -219,7 +219,7 @@ Updates configurations for a specified dataset.
|
|
219 |
### Request
|
220 |
|
221 |
- Method: PUT
|
222 |
-
- URL:
|
223 |
- Headers:
|
224 |
- `'content-Type: application/json'`
|
225 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -243,8 +243,6 @@ curl --request PUT \
|
|
243 |
--data '{
|
244 |
"name": "test",
|
245 |
"embedding_model": "BAAI/bge-zh-v1.5",
|
246 |
-
"chunk_count": 0,
|
247 |
-
"document_count": 0,
|
248 |
"parse_method": "naive"
|
249 |
}'
|
250 |
```
|
@@ -293,14 +291,12 @@ An error response includes a JSON object like the following:
|
|
293 |
|
294 |
**GET** `/api/v1/dataset?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
|
295 |
|
296 |
-
Lists
|
297 |
-
|
298 |
-
Retrieves a list of datasets.
|
299 |
|
300 |
### Request
|
301 |
|
302 |
- Method: GET
|
303 |
-
- URL:
|
304 |
- Headers:
|
305 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
306 |
|
@@ -407,10 +403,10 @@ Uploads documents to a specified dataset.
|
|
407 |
- Method: POST
|
408 |
- URL: `/api/v1/dataset/{dataset_id}/document`
|
409 |
- Headers:
|
410 |
-
- 'Content-Type: multipart/form-data'
|
411 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
412 |
- Form:
|
413 |
-
- 'file=@{FILE_PATH}'
|
414 |
|
415 |
#### Request example
|
416 |
|
@@ -425,9 +421,9 @@ curl --request POST \
|
|
425 |
#### Request parameters
|
426 |
|
427 |
- `"dataset_id"`: (*Path parameter*)
|
428 |
-
The dataset
|
429 |
- `"file"`: (*Body parameter*)
|
430 |
-
The
|
431 |
|
432 |
### Response
|
433 |
|
@@ -459,25 +455,25 @@ Updates configurations for a specified document.
|
|
459 |
### Request
|
460 |
|
461 |
- Method: PUT
|
462 |
-
- URL:
|
463 |
- Headers:
|
464 |
- `'content-Type: application/json'`
|
465 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
466 |
- Body:
|
467 |
-
- `name`:`string`
|
468 |
-
- `
|
469 |
-
- `parser_config`:`dict`
|
470 |
|
471 |
#### Request example
|
472 |
|
473 |
```bash
|
474 |
curl --request PUT \
|
475 |
--url http://{address}/api/v1/dataset/{dataset_id}/info/{document_id} \
|
476 |
-
--header 'Authorization: Bearer {
|
477 |
--header 'Content-Type: application/json' \
|
478 |
--data '{
|
479 |
"name": "manual.txt",
|
480 |
-
"
|
481 |
"parser_config": {"chunk_token_count": 128, "delimiter": "\n!?。;!?", "layout_recognize": true, "task_page_size": 12}
|
482 |
}'
|
483 |
|
@@ -485,8 +481,24 @@ curl --request PUT \
|
|
485 |
|
486 |
#### Request parameters
|
487 |
|
488 |
-
- `"
|
489 |
-
|
490 |
|
491 |
- `"parser_config"`: (*Body parameter*)
|
492 |
Configuration object for the parser.
|
@@ -525,7 +537,7 @@ Downloads a document from a specified dataset.
|
|
525 |
### Request
|
526 |
|
527 |
- Method: GET
|
528 |
-
- URL:
|
529 |
- Headers:
|
530 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
531 |
- Output:
|
@@ -570,7 +582,7 @@ An error response includes a JSON object like the following:
|
|
570 |
|
571 |
**GET** `/api/v1/dataset/{dataset_id}/info?offset={offset}&limit={limit}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}`
|
572 |
|
573 |
-
|
574 |
|
575 |
### Request
|
576 |
|
@@ -670,7 +682,7 @@ Deletes documents by ID.
|
|
670 |
### Request
|
671 |
|
672 |
- Method: DELETE
|
673 |
-
- URL:
|
674 |
- Headers:
|
675 |
- `'Content-Type: application/json'`
|
676 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -724,7 +736,7 @@ Parses documents in a specified dataset.
|
|
724 |
### Request
|
725 |
|
726 |
- Method: POST
|
727 |
-
- URL:
|
728 |
- Headers:
|
729 |
- `'content-Type: application/json'`
|
730 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -777,7 +789,7 @@ Stops parsing specified documents.
|
|
777 |
### Request
|
778 |
|
779 |
- Method: DELETE
|
780 |
-
- URL:
|
781 |
- Headers:
|
782 |
- `'content-Type: application/json'`
|
783 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -831,7 +843,7 @@ Adds a chunk to a specified document in a specified dataset.
|
|
831 |
### Request
|
832 |
|
833 |
- Method: POST
|
834 |
-
- URL:
|
835 |
- Headers:
|
836 |
- `'content-Type: application/json'`
|
837 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -896,12 +908,12 @@ An error response includes a JSON object like the following:
|
|
896 |
|
897 |
**GET** `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk?keywords={keywords}&offset={offset}&limit={limit}&id={id}`
|
898 |
|
899 |
-
|
900 |
|
901 |
### Request
|
902 |
|
903 |
- Method: GET
|
904 |
-
- URL:
|
905 |
- Headers:
|
906 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
907 |
|
@@ -992,7 +1004,7 @@ Deletes chunks by ID.
|
|
992 |
### Request
|
993 |
|
994 |
- Method: DELETE
|
995 |
-
- URL:
|
996 |
- Headers:
|
997 |
- `'content-Type: application/json'`
|
998 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -1046,7 +1058,7 @@ Updates content or configurations for a specified chunk.
|
|
1046 |
### Request
|
1047 |
|
1048 |
- Method: PUT
|
1049 |
-
- URL:
|
1050 |
- Headers:
|
1051 |
- `'content-Type: application/json'`
|
1052 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -1102,12 +1114,12 @@ An error response includes a JSON object like the following:
|
|
1102 |
|
1103 |
**POST** `/api/v1/retrieval`
|
1104 |
|
1105 |
-
|
1106 |
|
1107 |
### Request
|
1108 |
|
1109 |
- Method: POST
|
1110 |
-
- URL:
|
1111 |
- Headers:
|
1112 |
- `'content-Type: application/json'`
|
1113 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -1252,7 +1264,7 @@ Creates a chat assistant.
|
|
1252 |
### Request
|
1253 |
|
1254 |
- Method: POST
|
1255 |
-
- URL:
|
1256 |
- Headers:
|
1257 |
- `'content-Type: application/json'`
|
1258 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -1486,7 +1498,7 @@ Updates configurations for a specified chat assistant.
|
|
1486 |
### Request
|
1487 |
|
1488 |
- Method: PUT
|
1489 |
-
- URL:
|
1490 |
- Headers:
|
1491 |
- `'content-Type: application/json'`
|
1492 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -1538,7 +1550,7 @@ Deletes chat assistants by ID.
|
|
1538 |
### Request
|
1539 |
|
1540 |
- Method: DELETE
|
1541 |
-
- URL:
|
1542 |
- Headers:
|
1543 |
- `'content-Type: application/json'`
|
1544 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -1586,16 +1598,16 @@ An error response includes a JSON object like the following:
|
|
1586 |
|
1587 |
---
|
1588 |
|
1589 |
-
## List chats
|
1590 |
|
1591 |
-
**GET** `/api/v1/chat?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={
|
1592 |
|
1593 |
-
|
1594 |
|
1595 |
### Request
|
1596 |
|
1597 |
- Method: GET
|
1598 |
-
- URL:
|
1599 |
- Headers:
|
1600 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
1601 |
|
@@ -1732,7 +1744,7 @@ Create a chat session.
|
|
1732 |
### Request
|
1733 |
|
1734 |
- Method: POST
|
1735 |
-
- URL:
|
1736 |
- Headers:
|
1737 |
- `'content-Type: application/json'`
|
1738 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -1827,7 +1839,7 @@ Update a chat session
|
|
1827 |
### Request
|
1828 |
|
1829 |
- Method: PUT
|
1830 |
-
- URL:
|
1831 |
- Headers:
|
1832 |
- `'content-Type: application/json'`
|
1833 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -1882,7 +1894,7 @@ Lists sessions associated with a specified chat assistant.
|
|
1882 |
### Request
|
1883 |
|
1884 |
- Method: GET
|
1885 |
-
- URL:
|
1886 |
- Headers:
|
1887 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
1888 |
|
@@ -1967,7 +1979,7 @@ Deletes sessions by ID.
|
|
1967 |
### Request
|
1968 |
|
1969 |
- Method: DELETE
|
1970 |
-
- URL:
|
1971 |
- Headers:
|
1972 |
- `'content-Type: application/json'`
|
1973 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
@@ -2023,7 +2035,7 @@ Asks a question to start a conversation.
|
|
2023 |
### Request
|
2024 |
|
2025 |
- Method: POST
|
2026 |
-
- URL:
|
2027 |
- Headers:
|
2028 |
- `'content-Type: application/json'`
|
2029 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
20 |
### Request
|
21 |
|
22 |
- Method: POST
|
23 |
+
- URL: `/api/v1/dataset`
|
24 |
- Headers:
|
25 |
- `'content-Type: application/json'`
|
26 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
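The request shape listed above (method, URL, and the two headers) can be sketched in Python as follows. This is a minimal illustration, not the project's client: the helper name `build_create_dataset_request`, the address `localhost:9380`, and the one-field body `{"name": ...}` are assumptions made for the example. The request object is only constructed, never sent.

```python
import json
import urllib.request


def build_create_dataset_request(address: str, api_key: str, name: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/dataset."""
    # Assumed minimal body: only the dataset name.
    body = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{address}/api/v1/dataset",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


# Hypothetical address and placeholder key, for illustration only.
req = build_create_dataset_request("localhost:9380", "YOUR_API_KEY", "test")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a server.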
|
|
|
163 |
### Request
|
164 |
|
165 |
- Method: DELETE
|
166 |
+
- URL: `/api/v1/dataset`
|
167 |
- Headers:
|
168 |
- `'content-Type: application/json'`
|
169 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
219 |
### Request
|
220 |
|
221 |
- Method: PUT
|
222 |
+
- URL: `/api/v1/dataset/{dataset_id}`
|
223 |
- Headers:
|
224 |
- `'content-Type: application/json'`
|
225 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
243 |
--data '{
|
244 |
"name": "test",
|
245 |
"embedding_model": "BAAI/bge-zh-v1.5",
|
|
|
|
|
246 |
"parse_method": "naive"
|
247 |
}'
|
248 |
```
|
|
|
291 |
|
292 |
**GET** `/api/v1/dataset?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
|
293 |
|
294 |
+
Lists datasets.
|
|
|
|
|
295 |
|
296 |
### Request
|
297 |
|
298 |
- Method: GET
|
299 |
+
- URL: `/api/v1/dataset?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
|
300 |
- Headers:
|
301 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
302 |
|
|
|
403 |
- Method: POST
|
404 |
- URL: `/api/v1/dataset/{dataset_id}/document`
|
405 |
- Headers:
|
406 |
+
- `'Content-Type: multipart/form-data'`
|
407 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
408 |
- Form:
|
409 |
+
- `'file=@{FILE_PATH}'`
|
410 |
|
411 |
#### Request example
|
412 |
|
|
|
421 |
#### Request parameters
|
422 |
|
423 |
- `"dataset_id"`: (*Path parameter*)
|
424 |
+
The ID of the dataset to which the documents will be uploaded.
|
425 |
- `"file"`: (*Body parameter*)
|
426 |
+
The document to upload.
|
427 |
|
428 |
### Response
|
429 |
|
|
|
455 |
### Request
|
456 |
|
457 |
- Method: PUT
|
458 |
+
- URL: `/api/v1/dataset/{dataset_id}/document/{document_id}`
|
459 |
- Headers:
|
460 |
- `'content-Type: application/json'`
|
461 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
462 |
- Body:
|
463 |
+
- `"name"`:`string`
|
464 |
+
- `"chunk_method"`:`string`
|
465 |
+
- `"parser_config"`:`dict`
|
466 |
|
467 |
#### Request example
|
468 |
|
469 |
```bash
|
470 |
curl --request PUT \
|
471 |
--url http://{address}/api/v1/dataset/{dataset_id}/info/{document_id} \
|
472 |
+
--header 'Authorization: Bearer {YOUR_API_KEY}' \
|
473 |
--header 'Content-Type: application/json' \
|
474 |
--data '{
|
475 |
"name": "manual.txt",
|
476 |
+
"chunk_method": "manual",
|
477 |
"parser_config": {"chunk_token_count": 128, "delimiter": "\n!?。;!?", "layout_recognize": true, "task_page_size": 12}
|
478 |
}'
|
479 |
|
|
|
481 |
|
482 |
#### Request parameters
|
483 |
|
484 |
+
- `"name"`: (*Body parameter*), `string`
|
485 |
+
- `"chunk_method"`: (*Body parameter*), `string`
|
486 |
+
The parsing method to apply to the document.
|
487 |
+
- `"naive"`: General
|
488 |
+
- `"manual"`: Manual
|
489 |
+
- `"qa"`: Q&A
|
490 |
+
- `"table"`: Table
|
491 |
+
- `"paper"`: Paper
|
492 |
+
- `"book"`: Book
|
493 |
+
- `"laws"`: Laws
|
494 |
+
- `"presentation"`: Presentation
|
495 |
+
- `"picture"`: Picture
|
496 |
+
- `"one"`: One
|
497 |
+
- `"knowledge_graph"`: Knowledge Graph
|
498 |
+
- `"email"`: Email
|
499 |
+
-
|
500 |
+
|
501 |
+
### Returns
|
502 |
|
503 |
- `"parser_config"`: (*Body parameter*)
|
504 |
Configuration object for the parser.
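As a rough sketch of how a client might assemble the update body described above, the helper below checks `chunk_method` against the listed values before serializing. The function name `build_document_update` and the choice to omit unset keys are assumptions for illustration; the server's own validation may differ.

```python
import json

# Values copied from the "chunk_method" list in the request parameters.
CHUNK_METHODS = {
    "naive", "manual", "qa", "table", "paper", "book", "laws",
    "presentation", "picture", "one", "knowledge_graph", "email",
}


def build_document_update(name=None, chunk_method=None, parser_config=None) -> str:
    """Return a JSON body containing only the fields that were provided."""
    payload = {}
    if name is not None:
        payload["name"] = name
    if chunk_method is not None:
        if chunk_method not in CHUNK_METHODS:
            raise ValueError(f"unknown chunk_method: {chunk_method!r}")
        payload["chunk_method"] = chunk_method
    if parser_config is not None:
        payload["parser_config"] = parser_config
    # ensure_ascii=False keeps non-ASCII delimiters readable in the body.
    return json.dumps(payload, ensure_ascii=False)


body = build_document_update(
    name="manual.txt",
    chunk_method="manual",
    parser_config={"chunk_token_count": 128, "layout_recognize": True},
)
print(body)
```

Rejecting an unknown method on the client side gives an earlier, clearer error than waiting for the server's response.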
|
|
|
537 |
### Request
|
538 |
|
539 |
- Method: GET
|
540 |
+
- URL: `/api/v1/dataset/{dataset_id}/document/{document_id}`
|
541 |
- Headers:
|
542 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
543 |
- Output:
|
|
|
582 |
|
583 |
**GET** `/api/v1/dataset/{dataset_id}/info?offset={offset}&limit={limit}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}`
|
584 |
|
585 |
+
Lists documents in a specified dataset.
|
586 |
|
587 |
### Request
|
588 |
|
|
|
682 |
### Request
|
683 |
|
684 |
- Method: DELETE
|
685 |
+
- URL: `/api/v1/dataset/{dataset_id}/document`
|
686 |
- Headers:
|
687 |
- `'Content-Type: application/json'`
|
688 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
736 |
### Request
|
737 |
|
738 |
- Method: POST
|
739 |
+
- URL: `/api/v1/dataset/{dataset_id}/chunk`
|
740 |
- Headers:
|
741 |
- `'content-Type: application/json'`
|
742 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
789 |
### Request
|
790 |
|
791 |
- Method: DELETE
|
792 |
+
- URL: `/api/v1/dataset/{dataset_id}/chunk`
|
793 |
- Headers:
|
794 |
- `'content-Type: application/json'`
|
795 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
843 |
### Request
|
844 |
|
845 |
- Method: POST
|
846 |
+
- URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
|
847 |
- Headers:
|
848 |
- `'content-Type: application/json'`
|
849 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
908 |
|
909 |
**GET** `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk?keywords={keywords}&offset={offset}&limit={limit}&id={id}`
|
910 |
|
911 |
+
Lists chunks in a specified document.
|
912 |
|
913 |
### Request
|
914 |
|
915 |
- Method: GET
|
916 |
+
- URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk?keywords={keywords}&offset={offset}&limit={limit}&id={id}`
|
917 |
- Headers:
|
918 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
919 |
|
|
|
1004 |
### Request
|
1005 |
|
1006 |
- Method: DELETE
|
1007 |
+
- URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
|
1008 |
- Headers:
|
1009 |
- `'content-Type: application/json'`
|
1010 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
1058 |
### Request
|
1059 |
|
1060 |
- Method: PUT
|
1061 |
+
- URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk/{chunk_id}`
|
1062 |
- Headers:
|
1063 |
- `'content-Type: application/json'`
|
1064 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
1114 |
|
1115 |
**POST** `/api/v1/retrieval`
|
1116 |
|
1117 |
+
Retrieves chunks from specified datasets.
|
1118 |
|
1119 |
### Request
|
1120 |
|
1121 |
- Method: POST
|
1122 |
+
- URL: `/api/v1/retrieval`
|
1123 |
- Headers:
|
1124 |
- `'content-Type: application/json'`
|
1125 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
1264 |
### Request
|
1265 |
|
1266 |
- Method: POST
|
1267 |
+
- URL: `/api/v1/chat`
|
1268 |
- Headers:
|
1269 |
- `'content-Type: application/json'`
|
1270 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
1498 |
### Request
|
1499 |
|
1500 |
- Method: PUT
|
1501 |
+
- URL: `/api/v1/chat/{chat_id}`
|
1502 |
- Headers:
|
1503 |
- `'content-Type: application/json'`
|
1504 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
1550 |
### Request
|
1551 |
|
1552 |
- Method: DELETE
|
1553 |
+
- URL: `/api/v1/chat`
|
1554 |
- Headers:
|
1555 |
- `'content-Type: application/json'`
|
1556 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
1598 |
|
1599 |
---
|
1600 |
|
1601 |
+
## List chats
|
1602 |
|
1603 |
+
**GET** `/api/v1/chat?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={chat_name}&id={chat_id}`
|
1604 |
|
1605 |
+
Lists chat assistants.
|
1606 |
|
1607 |
### Request
|
1608 |
|
1609 |
- Method: GET
|
1610 |
+
- URL: `/api/v1/chat?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={chat_name}&id={chat_id}`
|
1611 |
- Headers:
|
1612 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
1613 |
|
|
|
1744 |
### Request
|
1745 |
|
1746 |
- Method: POST
|
1747 |
+
- URL: `/api/v1/chat/{chat_id}/session`
|
1748 |
- Headers:
|
1749 |
- `'content-Type: application/json'`
|
1750 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
1839 |
### Request
|
1840 |
|
1841 |
- Method: PUT
|
1842 |
+
- URL: `/api/v1/chat/{chat_id}/session/{session_id}`
|
1843 |
- Headers:
|
1844 |
- `'content-Type: application/json'`
|
1845 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
1894 |
### Request
|
1895 |
|
1896 |
- Method: GET
|
1897 |
+
- URL: `/api/v1/chat/{chat_id}/session?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={session_name}&id={session_id}`
|
1898 |
- Headers:
|
1899 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
1900 |
|
|
|
1979 |
### Request
|
1980 |
|
1981 |
- Method: DELETE
|
1982 |
+
- URL: `/api/v1/chat/{chat_id}/session`
|
1983 |
- Headers:
|
1984 |
- `'content-Type: application/json'`
|
1985 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
|
|
2035 |
### Request
|
2036 |
|
2037 |
- Method: POST
|
2038 |
+
- URL: `/api/v1/chat/{chat_id}/completion`
|
2039 |
- Headers:
|
2040 |
- `'content-Type: application/json'`
|
2041 |
- `'Authorization: Bearer {YOUR_API_KEY}'`
|
api/python_api_reference.md
CHANGED
@@ -17,10 +17,9 @@ RAGFlow.create_dataset(
|
|
17 |
name: str,
|
18 |
avatar: str = "",
|
19 |
description: str = "",
|
|
|
20 |
language: str = "English",
|
21 |
permission: str = "me",
|
22 |
-
document_count: int = 0,
|
23 |
-
chunk_count: int = 0,
|
24 |
chunk_method: str = "naive",
|
25 |
parser_config: DataSet.ParserConfig = None
|
26 |
) -> DataSet
|
@@ -143,7 +142,7 @@ RAGFlow.list_datasets(
|
|
143 |
) -> list[DataSet]
|
144 |
```
|
145 |
|
146 |
-
|
147 |
|
148 |
### Parameters
|
149 |
|
@@ -296,7 +295,7 @@ Updates configurations for the current document.
|
|
296 |
|
297 |
A dictionary representing the attributes to update, with the following keys:
|
298 |
|
299 |
-
- `"
|
300 |
- `"parser_config"`: `dict[str, Any]` The parsing configuration for the document:
|
301 |
- `"chunk_token_count"`: Defaults to `128`.
|
302 |
- `"layout_recognize"`: Defaults to `True`.
|
@@ -370,7 +369,7 @@ print(doc)
|
|
370 |
Dataset.list_documents(id: str = None, keywords: str = None, offset: int = 0, limit: int = 1024, order_by: str = "create_time", desc: bool = True) -> list[Document]
|
371 |
```
|
372 |
|
373 |
-
|
374 |
|
375 |
### Parameters
|
376 |
|
@@ -388,7 +387,7 @@ The starting index for the documents to retrieve. Typically used in conjunction
|
|
388 |
|
389 |
#### limit: `int`
|
390 |
|
391 |
-
The maximum number of documents to retrieve. Defaults to `1024`.
|
392 |
|
393 |
#### orderby: `str`
|
394 |
|
@@ -412,7 +411,7 @@ A `Document` object contains the following attributes:
|
|
412 |
- `name`: The document name. Defaults to `""`.
|
413 |
- `thumbnail`: The thumbnail image of the document. Defaults to `None`.
|
414 |
- `knowledgebase_id`: The dataset ID associated with the document. Defaults to `None`.
|
415 |
-
- `chunk_method` The chunk method name. Defaults to `""`.
|
416 |
- `parser_config`: `ParserConfig` Configuration object for the parser. Defaults to `{"pages": [[1, 1000000]]}`.
|
417 |
- `source_type`: The source type of the document. Defaults to `"local"`.
|
418 |
- `type`: Type or category of the document. Defaults to `""`. Reserved for future use.
|
@@ -425,7 +424,7 @@ A `Document` object contains the following attributes:
|
|
425 |
- `process_begin_at`: `datetime` The start time of document processing. Defaults to `None`.
|
426 |
- `process_duation`: `float` Duration of the processing in seconds. Defaults to `0.0`.
|
427 |
- `run`: `str` The document's processing status:
|
428 |
-
- `"0"`: UNSTART (default)
|
429 |
- `"1"`: RUNNING
|
430 |
- `"2"`: CANCEL
|
431 |
- `"3"`: DONE
|
@@ -506,9 +505,9 @@ The IDs of the documents to parse.
|
|
506 |
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
507 |
dataset = rag_object.create_dataset(name="dataset_name")
|
508 |
documents = [
|
509 |
-
{'
|
510 |
-
{'
|
511 |
-
{'
|
512 |
]
|
513 |
dataset.upload_documents(documents)
|
514 |
documents = dataset.list_documents(keywords="test")
|
@@ -546,9 +545,9 @@ The IDs of the documents for which parsing should be stopped.
|
|
546 |
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
547 |
dataset = rag_object.create_dataset(name="dataset_name")
|
548 |
documents = [
|
549 |
-
{'
|
550 |
-
{'
|
551 |
-
{'
|
552 |
]
|
553 |
dataset.upload_documents(documents)
|
554 |
documents = dataset.list_documents(keywords="test")
|
@@ -566,7 +565,7 @@ print("Async bulk parsing cancelled.")
|
|
566 |
## Add chunk
|
567 |
|
568 |
```python
|
569 |
-
Document.add_chunk(content:str) -> Chunk
|
570 |
```
|
571 |
|
572 |
Adds a chunk to the current document.
|
@@ -577,7 +576,7 @@ Adds a chunk to the current document.
|
|
577 |
|
578 |
The text content of the chunk.
|
579 |
|
580 |
-
#### important_keywords: `list[str]`
|
581 |
|
582 |
The key terms or phrases to tag with the chunk.
|
583 |
|
@@ -588,7 +587,7 @@ The key terms or phrases to tag with the chunk.
|
|
588 |
|
589 |
A `Chunk` object contains the following attributes:
|
590 |
|
591 |
-
- `id`: `str`
|
592 |
- `content`: `str` Content of the chunk.
|
593 |
- `important_keywords`: `list[str]` A list of key terms or phrases to tag with the chunk.
|
594 |
- `create_time`: `str` The time when the chunk was created (added to the document).
|
@@ -596,9 +595,9 @@ A `Chunk` object contains the following attributes:
|
|
596 |
- `knowledgebase_id`: `str` The ID of the associated dataset.
|
597 |
- `document_name`: `str` The name of the associated document.
|
598 |
- `document_id`: `str` The ID of the associated document.
|
599 |
-
- `available`: `
|
600 |
-
- `
|
601 |
-
- `
|
602 |
|
603 |
|
604 |
### Examples
|
@@ -619,26 +618,26 @@ chunk = doc.add_chunk(content="xxxxxxx")
|
|
619 |
## List chunks
|
620 |
|
621 |
```python
|
622 |
-
Document.list_chunks(keywords: str = None, offset: int =
|
623 |
```
|
624 |
|
625 |
-
|
626 |
|
627 |
### Parameters
|
628 |
|
629 |
-
#### keywords: `str`
|
630 |
|
631 |
The keywords used to match chunk content. Defaults to `None`
|
632 |
|
633 |
#### offset: `int`
|
634 |
|
635 |
-
The starting index for the chunks to retrieve. Defaults to `1
|
636 |
|
637 |
-
#### limit
|
638 |
|
639 |
-
The maximum number of chunks to retrieve. Default: `
|
640 |
|
641 |
-
#### id
|
642 |
|
643 |
The ID of the chunk to retrieve. Default: `None`
|
644 |
|
@@ -713,9 +712,9 @@ A dictionary representing the attributes to update, with the following keys:
|
|
713 |
|
714 |
- `"content"`: `str` Content of the chunk.
|
715 |
- `"important_keywords"`: `list[str]` A list of key terms or phrases to tag with the chunk.
|
716 |
-
- `"available"`: `
|
717 |
-
- `
|
718 |
-
- `
|
719 |
|
720 |
### Returns
|
721 |
|
@@ -741,10 +740,10 @@ chunk.update({"content":"sdfx..."})
|
|
741 |
## Retrieve chunks
|
742 |
|
743 |
```python
|
744 |
-
RAGFlow.retrieve(question:str="", datasets:list[str]=None, document=list[str]=None, offset:int=1, limit:int=
|
745 |
```
|
746 |
|
747 |
-
|
748 |
|
749 |
### Parameters
|
750 |
|
@@ -752,21 +751,21 @@ RAGFlow.retrieve(question:str="", datasets:list[str]=None, document=list[str]=No
|
|
752 |
|
753 |
The user query or query keywords. Defaults to `""`.
|
754 |
|
755 |
-
#### datasets: `list[str]`, *Required
|
756 |
|
757 |
The datasets to search from.
|
758 |
|
759 |
#### document: `list[str]`
|
760 |
|
761 |
-
The documents to search from.
|
762 |
|
763 |
#### offset: `int`
|
764 |
|
765 |
-
The starting index for the documents to retrieve. Defaults to `
|
766 |
|
767 |
#### limit: `int`
|
768 |
|
769 |
-
The maximum number of chunks to retrieve. Defaults to `
|
770 |
|
771 |
#### similarity_threshold: `float`
|
772 |
|
@@ -786,14 +785,17 @@ The ID of the rerank model. Defaults to `None`.
|
|
786 |
|
787 |
#### keyword: `bool`
|
788 |
|
789 |
-
Indicates whether keyword-based matching
|
790 |
|
791 |
-
- `True`:
|
792 |
-
- `False`:
|
793 |
|
794 |
#### highlight: `bool`
|
795 |
|
796 |
-
Specifying whether to enable highlighting of matched terms in the results
|
|
|
|
|
|
|
797 |
|
798 |
### Returns
|
799 |
|
@@ -849,15 +851,15 @@ Creates a chat assistant.
|
|
849 |
|
850 |
The following shows the attributes of a `Chat` object:
|
851 |
|
852 |
-
#### name: `str`, *Required
|
853 |
|
854 |
-
The name of the chat assistant
|
855 |
|
856 |
#### avatar: `str`
|
857 |
|
858 |
Base64 encoding of the avatar. Defaults to `""`.
|
859 |
|
860 |
-
#### knowledgebases: `list[str]`
|
861 |
|
862 |
The IDs of the associated datasets. Defaults to `[""]`.
|
863 |
|
@@ -1016,7 +1018,7 @@ RAGFlow.list_chats(
|
|
1016 |
) -> list[Chat]
|
1017 |
```
|
1018 |
|
1019 |
-
|
1020 |
|
1021 |
### Parameters
|
1022 |
|
|
|
17 |
name: str,
|
18 |
avatar: str = "",
|
19 |
description: str = "",
|
20 |
+
embedding_model: str = "BAAI/bge-zh-v1.5",
|
21 |
language: str = "English",
|
22 |
permission: str = "me",
|
|
|
|
|
23 |
chunk_method: str = "naive",
|
24 |
parser_config: DataSet.ParserConfig = None
|
25 |
) -> DataSet
|
|
|
142 |
) -> list[DataSet]
|
143 |
```
|
144 |
|
145 |
+
Lists datasets.
|
146 |
|
147 |
### Parameters
|
148 |
|
|
|
295 |
|
296 |
A dictionary representing the attributes to update, with the following keys:
|
297 |
|
298 |
+
- `"display_name"`: `str` The name of the document to update.
|
299 |
- `"parser_config"`: `dict[str, Any]` The parsing configuration for the document:
|
300 |
- `"chunk_token_count"`: Defaults to `128`.
|
301 |
- `"layout_recognize"`: Defaults to `True`.
|
|
|
369 |
Dataset.list_documents(id: str = None, keywords: str = None, offset: int = 0, limit: int = 1024, order_by: str = "create_time", desc: bool = True) -> list[Document]
|
370 |
```
|
371 |
|
372 |
+
Lists documents in the current dataset.
|
373 |
|
374 |
### Parameters
|
375 |
|
|
|
387 |
|
388 |
#### limit: `int`
|
389 |
|
390 |
+
The maximum number of documents to retrieve. Defaults to `1024`.
|
391 |
|
392 |
#### orderby: `str`
|
393 |
|
|
|
411 |
- `name`: The document name. Defaults to `""`.
|
412 |
- `thumbnail`: The thumbnail image of the document. Defaults to `None`.
|
413 |
- `knowledgebase_id`: The dataset ID associated with the document. Defaults to `None`.
|
414 |
+
- `chunk_method`: The chunk method name. Defaults to `"naive"`.
|
415 |
- `parser_config`: `ParserConfig` Configuration object for the parser. Defaults to `{"pages": [[1, 1000000]]}`.
|
416 |
- `source_type`: The source type of the document. Defaults to `"local"`.
|
417 |
- `type`: Type or category of the document. Defaults to `""`. Reserved for future use.
|
|
|
424 |
- `process_begin_at`: `datetime` The start time of document processing. Defaults to `None`.
|
425 |
- `process_duation`: `float` Duration of the processing in seconds. Defaults to `0.0`.
|
426 |
- `run`: `str` The document's processing status:
|
427 |
+
- `"0"`: UNSTART (default)
|
428 |
- `"1"`: RUNNING
|
429 |
- `"2"`: CANCEL
|
430 |
- `"3"`: DONE
|
|
|
505 |
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
506 |
dataset = rag_object.create_dataset(name="dataset_name")
|
507 |
documents = [
|
508 |
+
{'display_name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
|
509 |
+
{'display_name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
|
510 |
+
{'display_name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
|
511 |
]
|
512 |
dataset.upload_documents(documents)
|
513 |
documents = dataset.list_documents(keywords="test")
|
|
|
545 |
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
546 |
dataset = rag_object.create_dataset(name="dataset_name")
|
547 |
documents = [
|
548 |
+
{'display_name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
|
549 |
+
{'display_name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
|
550 |
+
{'display_name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
|
551 |
]
|
552 |
dataset.upload_documents(documents)
|
553 |
documents = dataset.list_documents(keywords="test")
|
|
|
565 |
## Add chunk
|
566 |
|
567 |
```python
|
568 |
+
Document.add_chunk(content:str, important_keywords:list[str] = []) -> Chunk
|
569 |
```
|
570 |
|
571 |
Adds a chunk to the current document.
|
|
|
576 |
|
577 |
The text content of the chunk.
|
578 |
|
579 |
+
#### important_keywords: `list[str]`
|
580 |
|
581 |
The key terms or phrases to tag with the chunk.
|
582 |
|
|
|
587 |
|
588 |
A `Chunk` object contains the following attributes:
|
589 |
|
590 |
+
- `id`: `str`
|
591 |
- `content`: `str` Content of the chunk.
|
592 |
- `important_keywords`: `list[str]` A list of key terms or phrases to tag with the chunk.
|
593 |
- `create_time`: `str` The time when the chunk was created (added to the document).
|
|
|
595 |
- `knowledgebase_id`: `str` The ID of the associated dataset.
|
596 |
- `document_name`: `str` The name of the associated document.
|
597 |
- `document_id`: `str` The ID of the associated document.
|
598 |
+
- `available`: `bool` The chunk's availability status in the dataset. Value options:
|
599 |
+
- `False`: Unavailable
|
600 |
+
- `True`: Available
|
601 |
|
602 |
|
603 |
### Examples
|
|
|
618 |
## List chunks
|
619 |
|
620 |
```python
|
621 |
+
Document.list_chunks(keywords: str = None, offset: int = 1, limit: int = 1024, id: str = None) -> list[Chunk]
|
622 |
```
|
623 |
|
624 |
+
Lists chunks in the current document.
|
625 |
|
626 |
### Parameters
|
627 |
|
628 |
+
#### keywords: `str`
|
629 |
|
630 |
The keywords used to match chunk content. Defaults to `None`
|
631 |
|
632 |
#### offset: `int`
|
633 |
|
634 |
+
The starting index for the chunks to retrieve. Defaults to `1`.
|
635 |
|
636 |
+
#### limit: `int`
|
637 |
|
638 |
+
The maximum number of chunks to retrieve. Defaults to `1024`.
|
639 |
|
640 |
+
#### id: `str`
|
641 |
|
642 |
The ID of the chunk to retrieve. Defaults to `None`.
|
643 |
|
|
|
712 |
|
713 |
- `"content"`: `str` Content of the chunk.
|
714 |
- `"important_keywords"`: `list[str]` A list of key terms or phrases to tag with the chunk.
|
715 |
+
- `"available"`: `bool` The chunk's availability status in the dataset. Value options:
|
716 |
+
- `False`: Unavailable
|
717 |
+
- `True`: Available
|
718 |
|
719 |
### Returns
|
720 |
|
|
|
740 |
## Retrieve chunks
|
741 |
|
742 |
```python
|
743 |
+
RAGFlow.retrieve(question: str = "", datasets: list[str] = None, document: list[str] = None, offset: int = 1, limit: int = 1024, similarity_threshold: float = 0.2, vector_similarity_weight: float = 0.3, top_k: int = 1024, rerank_id: str = None, keyword: bool = False, highlight: bool = False) -> list[Chunk]
|
744 |
```
|
745 |
|
746 |
+
Retrieves chunks from specified datasets.
|
747 |
|
748 |
### Parameters
|
749 |
|
|
|
751 |
|
752 |
The user query or query keywords. Defaults to `""`.
|
753 |
|
754 |
+
#### datasets: `list[str]`, *Required*
|
755 |
|
756 |
The datasets to search from.
|
757 |
|
758 |
#### document: `list[str]`
|
759 |
|
760 |
+
The documents to search from. Defaults to `None`.
|
761 |
|
762 |
#### offset: `int`
|
763 |
|
764 |
+
The starting index for the documents to retrieve. Defaults to `1`.
|
765 |
|
766 |
#### limit: `int`
|
767 |
|
768 |
+
The maximum number of chunks to retrieve. Defaults to `1024`.
|
769 |
|
770 |
#### similarity_threshold: `float`
|
771 |
|
|
|
785 |
|
786 |
#### keyword: `bool`
|
787 |
|
788 |
+
Indicates whether to enable keyword-based matching:
|
789 |
|
790 |
+
- `True`: Enable keyword-based matching.
|
791 |
+
- `False`: Disable keyword-based matching (default).
|
792 |
|
793 |
#### highlight: `bool`
|
794 |
|
795 |
+
Specifies whether to enable highlighting of matched terms in the results:
|
796 |
+
|
797 |
+
- `True`: Enable highlighting of matched terms.
|
798 |
+
- `False`: Disable highlighting of matched terms (default).
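One way to keep the retrieval defaults in a single place is a small options object that is expanded into keyword arguments. The sketch below is illustrative only: `RetrieveOptions` and `to_kwargs` are hypothetical names, the defaults are copied from the documented signature, and the `highlight` spelling follows the parameter heading above, so it is worth checking which spelling the installed SDK accepts.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class RetrieveOptions:
    """Defaults mirror the documented RAGFlow.retrieve signature."""
    question: str = ""
    datasets: Optional[List[str]] = None
    document: Optional[List[str]] = None
    offset: int = 1
    limit: int = 1024
    similarity_threshold: float = 0.2
    vector_similarity_weight: float = 0.3
    top_k: int = 1024
    rerank_id: Optional[str] = None
    keyword: bool = False
    highlight: bool = False

    def to_kwargs(self) -> dict:
        # `datasets` is marked *Required* in the parameter list.
        if not self.datasets:
            raise ValueError("datasets is required")
        return asdict(self)


opts = RetrieveOptions(question="What is RAG?", datasets=["<DATASET_ID>"], keyword=True)
kwargs = opts.to_kwargs()
print(kwargs["limit"], kwargs["keyword"], kwargs["highlight"])
```

The resulting dictionary could then be passed as `rag_object.retrieve(**kwargs)`, assuming the SDK accepts these keyword names.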
|
799 |
|
800 |
### Returns
|
801 |
|
|
|
851 |
|
852 |
The following shows the attributes of a `Chat` object:
|
853 |
|
854 |
+
#### name: `str`, *Required*
|
855 |
|
856 |
+
The name of the chat assistant.
|
857 |
|
858 |
#### avatar: `str`
|
859 |
|
860 |
Base64 encoding of the avatar. Defaults to `""`.
|
861 |
|
862 |
+
#### knowledgebases: `list[str]`
|
863 |
|
864 |
The IDs of the associated datasets. Defaults to `[""]`.
|
865 |
|
|
|
1018 |
) -> list[Chat]
|
1019 |
```
|
1020 |
|
1021 |
+
Lists chat assistants.
|
1022 |
|
1023 |
### Parameters
|
1024 |
|