writinwaters committed on
Commit
5ec7450
·
1 Parent(s): c31ab66

DRAFT: Updated python and http api references (#2973)


### What problem does this PR solve?


### Type of change

- [x] Documentation Update

api/http_api_reference.md CHANGED
@@ -20,7 +20,7 @@ Creates a dataset.
20
  ### Request
21
 
22
  - Method: POST
23
- - URL: `http://{address}/api/v1/dataset`
24
  - Headers:
25
  - `'content-Type: application/json'`
26
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -163,7 +163,7 @@ Deletes datasets by ID.
163
  ### Request
164
 
165
  - Method: DELETE
166
- - URL: `http://{address}/api/v1/dataset`
167
  - Headers:
168
  - `'content-Type: application/json'`
169
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -219,7 +219,7 @@ Updates configurations for a specified dataset.
219
  ### Request
220
 
221
  - Method: PUT
222
- - URL: `http://{address}/api/v1/dataset/{dataset_id}`
223
  - Headers:
224
  - `'content-Type: application/json'`
225
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -243,8 +243,6 @@ curl --request PUT \
243
  --data '{
244
  "name": "test",
245
  "embedding_model": "BAAI/bge-zh-v1.5",
246
- "chunk_count": 0,
247
- "document_count": 0,
248
  "parse_method": "naive"
249
  }'
250
  ```
@@ -293,14 +291,12 @@ An error response includes a JSON object like the following:
293
 
294
  **GET** `/api/v1/dataset?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
295
 
296
- Lists all datasets?????
297
-
298
- Retrieves a list of datasets.
299
 
300
  ### Request
301
 
302
  - Method: GET
303
- - URL: `http://{address}/api/v1/dataset?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
304
  - Headers:
305
  - `'Authorization: Bearer {YOUR_API_KEY}'`
306
 
@@ -407,10 +403,10 @@ Uploads documents to a specified dataset.
407
  - Method: POST
408
  - URL: `/api/v1/dataset/{dataset_id}/document`
409
  - Headers:
410
- - 'Content-Type: multipart/form-data'
411
  - `'Authorization: Bearer {YOUR_API_KEY}'`
412
  - Form:
413
- - 'file=@{FILE_PATH}'
414
 
415
  #### Request example
416
 
@@ -425,9 +421,9 @@ curl --request POST \
425
  #### Request parameters
426
 
427
  - `"dataset_id"`: (*Path parameter*)
428
- The dataset ID.
429
  - `"file"`: (*Body parameter*)
430
- The file to upload.
431
 
432
  ### Response
433
 
@@ -459,25 +455,25 @@ Updates configurations for a specified document.
459
  ### Request
460
 
461
  - Method: PUT
462
- - URL: `http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}`
463
  - Headers:
464
  - `'content-Type: application/json'`
465
  - `'Authorization: Bearer {YOUR_API_KEY}'`
466
  - Body:
467
- - `name`:`string`
468
- - `parser_method`:`string`
469
- - `parser_config`:`dict`
470
 
471
  #### Request example
472
 
473
  ```bash
474
  curl --request PUT \
475
  --url http://{address}/api/v1/dataset/{dataset_id}/info/{document_id} \
476
- --header 'Authorization: Bearer {YOUR_ACCESS TOKEN}' \
477
  --header 'Content-Type: application/json' \
478
  --data '{
479
  "name": "manual.txt",
480
- "parser_method": "manual",
481
  "parser_config": {"chunk_token_count": 128, "delimiter": "\n!?。;!?", "layout_recognize": true, "task_page_size": 12}
482
  }'
483
 
@@ -485,8 +481,24 @@ curl --request PUT \
485
 
486
  #### Request parameters
487
 
488
- - `"parser_method"`: (*Body parameter*)
489
- Method used to parse the document.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
490
 
491
  - `"parser_config"`: (*Body parameter*)
492
  Configuration object for the parser.
@@ -525,7 +537,7 @@ Downloads a document from a specified dataset.
525
  ### Request
526
 
527
  - Method: GET
528
- - URL: `http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}`
529
  - Headers:
530
  - `'Authorization: Bearer {YOUR_API_KEY}'`
531
  - Output:
@@ -570,7 +582,7 @@ An error response includes a JSON object like the following:
570
 
571
  **GET** `/api/v1/dataset/{dataset_id}/info?offset={offset}&limit={limit}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}`
572
 
573
- Retrieves a list of documents from a specified dataset.
574
 
575
  ### Request
576
 
@@ -670,7 +682,7 @@ Deletes documents by ID.
670
  ### Request
671
 
672
  - Method: DELETE
673
- - URL: `http://{address}/api/v1/dataset/{dataset_id}/document`
674
  - Headers:
675
  - `'Content-Type: application/json'`
676
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -724,7 +736,7 @@ Parses documents in a specified dataset.
724
  ### Request
725
 
726
  - Method: POST
727
- - URL: `http://{address}/api/v1/dataset/{dataset_id}/chunk `
728
  - Headers:
729
  - `'content-Type: application/json'`
730
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -777,7 +789,7 @@ Stops parsing specified documents.
777
  ### Request
778
 
779
  - Method: DELETE
780
- - URL: `http://{address}/api/v1/dataset/{dataset_id}/chunk`
781
  - Headers:
782
  - `'content-Type: application/json'`
783
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -831,7 +843,7 @@ Adds a chunk to a specified document in a specified dataset.
831
  ### Request
832
 
833
  - Method: POST
834
- - URL: `http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
835
  - Headers:
836
  - `'content-Type: application/json'`
837
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -896,12 +908,12 @@ An error response includes a JSON object like the following:
896
 
897
  **GET** `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk?keywords={keywords}&offset={offset}&limit={limit}&id={id}`
898
 
899
- Retrieves a list of chunks from a specified document in a specified dataset.
900
 
901
  ### Request
902
 
903
  - Method: GET
904
- - URL: `http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk?keywords={keywords}&offset={offset}&limit={limit}&id={id}`
905
  - Headers:
906
  - `'Authorization: Bearer {YOUR_API_KEY}'`
907
 
@@ -992,7 +1004,7 @@ Deletes chunks by ID.
992
  ### Request
993
 
994
  - Method: DELETE
995
- - URL: `http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
996
  - Headers:
997
  - `'content-Type: application/json'`
998
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -1046,7 +1058,7 @@ Updates content or configurations for a specified chunk.
1046
  ### Request
1047
 
1048
  - Method: PUT
1049
- - URL: `http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk/{chunk_id}`
1050
  - Headers:
1051
  - `'content-Type: application/json'`
1052
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -1102,12 +1114,12 @@ An error response includes a JSON object like the following:
1102
 
1103
  **POST** `/api/v1/retrieval`
1104
 
1105
- Retrieval test of a dataset
1106
 
1107
  ### Request
1108
 
1109
  - Method: POST
1110
- - URL: `http://{address}/api/v1/retrieval`
1111
  - Headers:
1112
  - `'content-Type: application/json'`
1113
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -1252,7 +1264,7 @@ Creates a chat assistant.
1252
  ### Request
1253
 
1254
  - Method: POST
1255
- - URL: `http://{address}/api/v1/chat`
1256
  - Headers:
1257
  - `'content-Type: application/json'`
1258
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -1486,7 +1498,7 @@ Updates configurations for a specified chat assistant.
1486
  ### Request
1487
 
1488
  - Method: PUT
1489
- - URL: `http://{address}/api/v1/chat/{chat_id}`
1490
  - Headers:
1491
  - `'content-Type: application/json'`
1492
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -1538,7 +1550,7 @@ Deletes chat assistants by ID.
1538
  ### Request
1539
 
1540
  - Method: DELETE
1541
- - URL: `http://{address}/api/v1/chat`
1542
  - Headers:
1543
  - `'content-Type: application/json'`
1544
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -1586,16 +1598,16 @@ An error response includes a JSON object like the following:
1586
 
1587
  ---
1588
 
1589
- ## List chats (INCONSISTENT WITH THE PYTHON API)
1590
 
1591
- **GET** `/api/v1/chat?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
1592
 
1593
- Retrieves a list of chat assistants.
1594
 
1595
  ### Request
1596
 
1597
  - Method: GET
1598
- - URL: `http://{address}/api/v1/chat?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
1599
  - Headers:
1600
  - `'Authorization: Bearer {YOUR_API_KEY}'`
1601
 
@@ -1732,7 +1744,7 @@ Create a chat session.
1732
  ### Request
1733
 
1734
  - Method: POST
1735
- - URL: `http://{address}/api/v1/chat/{chat_id}/session`
1736
  - Headers:
1737
  - `'content-Type: application/json'`
1738
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -1827,7 +1839,7 @@ Update a chat session
1827
  ### Request
1828
 
1829
  - Method: PUT
1830
- - URL: `http://{address}/api/v1/chat/{chat_id}/session/{session_id}`
1831
  - Headers:
1832
  - `'content-Type: application/json'`
1833
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -1882,7 +1894,7 @@ Lists sessions associated with a specified chat assistant.
1882
  ### Request
1883
 
1884
  - Method: GET
1885
- - URL: `http://{address}/api/v1/chat/{chat_id}/session?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
1886
  - Headers:
1887
  - `'Authorization: Bearer {YOUR_API_KEY}'`
1888
 
@@ -1967,7 +1979,7 @@ Deletes sessions by ID.
1967
  ### Request
1968
 
1969
  - Method: DELETE
1970
- - URL: `http://{address}/api/v1/chat/{chat_id}/session`
1971
  - Headers:
1972
  - `'content-Type: application/json'`
1973
  - `'Authorization: Bearer {YOUR_API_KEY}'`
@@ -2023,7 +2035,7 @@ Asks a question to start a conversation.
2023
  ### Request
2024
 
2025
  - Method: POST
2026
- - URL: `http://{address}/api/v1/chat/{chat_id}/completion`
2027
  - Headers:
2028
  - `'content-Type: application/json'`
2029
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
20
  ### Request
21
 
22
  - Method: POST
23
+ - URL: `/api/v1/dataset`
24
  - Headers:
25
  - `'content-Type: application/json'`
26
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
163
  ### Request
164
 
165
  - Method: DELETE
166
+ - URL: `/api/v1/dataset`
167
  - Headers:
168
  - `'content-Type: application/json'`
169
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
219
  ### Request
220
 
221
  - Method: PUT
222
+ - URL: `/api/v1/dataset/{dataset_id}`
223
  - Headers:
224
  - `'content-Type: application/json'`
225
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
243
  --data '{
244
  "name": "test",
245
  "embedding_model": "BAAI/bge-zh-v1.5",
 
 
246
  "parse_method": "naive"
247
  }'
248
  ```
 
291
 
292
  **GET** `/api/v1/dataset?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
293
 
294
+ Lists datasets.
 
 
295
 
296
  ### Request
297
 
298
  - Method: GET
299
+ - URL: `/api/v1/dataset?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
300
  - Headers:
301
  - `'Authorization: Bearer {YOUR_API_KEY}'`
302
 
 
403
  - Method: POST
404
  - URL: `/api/v1/dataset/{dataset_id}/document`
405
  - Headers:
406
+ - `'Content-Type: multipart/form-data'`
407
  - `'Authorization: Bearer {YOUR_API_KEY}'`
408
  - Form:
409
+ - `'file=@{FILE_PATH}'`
410
 
411
  #### Request example
412
 
 
421
  #### Request parameters
422
 
423
  - `"dataset_id"`: (*Path parameter*)
424
+ The ID of the dataset to which the documents will be uploaded.
425
  - `"file"`: (*Body parameter*)
426
+ The document to upload.
427
 
428
  ### Response
429
 
 
455
  ### Request
456
 
457
  - Method: PUT
458
+ - URL: `/api/v1/dataset/{dataset_id}/document/{document_id}`
459
  - Headers:
460
  - `'content-Type: application/json'`
461
  - `'Authorization: Bearer {YOUR_API_KEY}'`
462
  - Body:
463
+ - `"name"`:`string`
464
+ - `"chunk_method"`:`string`
465
+ - `"parser_config"`:`dict`
466
 
467
  #### Request example
468
 
469
  ```bash
470
  curl --request PUT \
471
  --url http://{address}/api/v1/dataset/{dataset_id}/info/{document_id} \
472
+ --header 'Authorization: Bearer {YOUR_API_KEY}' \
473
  --header 'Content-Type: application/json' \
474
  --data '{
475
  "name": "manual.txt",
476
+ "chunk_method": "manual",
477
  "parser_config": {"chunk_token_count": 128, "delimiter": "\n!?。;!?", "layout_recognize": true, "task_page_size": 12}
478
  }'
479
 
 
481
 
482
  #### Request parameters
483
 
484
+ - `"name"`: (*Body parameter*), `string`
485
+ - `"chunk_method"`: (*Body parameter*), `string`
486
+ The parsing method to apply to the document.
487
+ - `"naive"`: General
488
+ - `"manual"`: Manual
489
+ - `"qa"`: Q&A
490
+ - `"table"`: Table
491
+ - `"paper"`: Paper
492
+ - `"book"`: Book
493
+ - `"laws"`: Laws
494
+ - `"presentation"`: Presentation
495
+ - `"picture"`: Picture
496
+ - `"one"`: One
497
+ - `"knowledge_graph"`: Knowledge Graph
498
+ - `"email"`: Email
499
+ -
500
+
501
+ ### Returns
502
 
503
  - `"parser_config"`: (*Body parameter*)
504
  Configuration object for the parser.
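
The `chunk_method` values listed above lend themselves to a client-side check before the update request is sent. The following is a minimal sketch under stated assumptions: `CHUNK_METHODS` and `build_update_body` are hypothetical names introduced for illustration and are not part of RAGFlow, and the set simply copies the values shown in this hunk.

```python
import json

# Values copied from the parameter list in this reference (assumed complete).
CHUNK_METHODS = {
    "naive", "manual", "qa", "table", "paper", "book", "laws",
    "presentation", "picture", "one", "knowledge_graph", "email",
}


def build_update_body(name=None, chunk_method=None, parser_config=None):
    """Return a JSON string for the document-update request body.

    Hypothetical helper: it only assembles and validates the body,
    it does not send any request.
    """
    if chunk_method is not None and chunk_method not in CHUNK_METHODS:
        raise ValueError(f"unsupported chunk_method: {chunk_method!r}")
    body = {
        "name": name,
        "chunk_method": chunk_method,
        "parser_config": parser_config,
    }
    # Only include the fields the caller actually set.
    return json.dumps({k: v for k, v in body.items() if v is not None})


print(build_update_body(name="manual.txt", chunk_method="manual"))
```

The resulting string could be passed to `curl --data` or to any HTTP client as the request body.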
 
537
  ### Request
538
 
539
  - Method: GET
540
+ - URL: `/api/v1/dataset/{dataset_id}/document/{document_id}`
541
  - Headers:
542
  - `'Authorization: Bearer {YOUR_API_KEY}'`
543
  - Output:
 
582
 
583
  **GET** `/api/v1/dataset/{dataset_id}/info?offset={offset}&limit={limit}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}`
584
 
585
+ Lists documents in a specified dataset.
586
 
587
  ### Request
588
 
 
682
  ### Request
683
 
684
  - Method: DELETE
685
+ - URL: `/api/v1/dataset/{dataset_id}/document`
686
  - Headers:
687
  - `'Content-Type: application/json'`
688
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
736
  ### Request
737
 
738
  - Method: POST
739
+ - URL: `/api/v1/dataset/{dataset_id}/chunk`
740
  - Headers:
741
  - `'content-Type: application/json'`
742
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
789
  ### Request
790
 
791
  - Method: DELETE
792
+ - URL: `/api/v1/dataset/{dataset_id}/chunk`
793
  - Headers:
794
  - `'content-Type: application/json'`
795
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
843
  ### Request
844
 
845
  - Method: POST
846
+ - URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
847
  - Headers:
848
  - `'content-Type: application/json'`
849
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
908
 
909
  **GET** `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk?keywords={keywords}&offset={offset}&limit={limit}&id={id}`
910
 
911
+ Lists chunks in a specified document.
912
 
913
  ### Request
914
 
915
  - Method: GET
916
+ - URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk?keywords={keywords}&offset={offset}&limit={limit}&id={id}`
917
  - Headers:
918
  - `'Authorization: Bearer {YOUR_API_KEY}'`
919
 
 
1004
  ### Request
1005
 
1006
  - Method: DELETE
1007
+ - URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk`
1008
  - Headers:
1009
  - `'content-Type: application/json'`
1010
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
1058
  ### Request
1059
 
1060
  - Method: PUT
1061
+ - URL: `/api/v1/dataset/{dataset_id}/document/{document_id}/chunk/{chunk_id}`
1062
  - Headers:
1063
  - `'content-Type: application/json'`
1064
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
1114
 
1115
  **POST** `/api/v1/retrieval`
1116
 
1117
+ Retrieves chunks from specified datasets.
1118
 
1119
  ### Request
1120
 
1121
  - Method: POST
1122
+ - URL: `/api/v1/retrieval`
1123
  - Headers:
1124
  - `'content-Type: application/json'`
1125
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
1264
  ### Request
1265
 
1266
  - Method: POST
1267
+ - URL: `/api/v1/chat`
1268
  - Headers:
1269
  - `'content-Type: application/json'`
1270
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
1498
  ### Request
1499
 
1500
  - Method: PUT
1501
+ - URL: `/api/v1/chat/{chat_id}`
1502
  - Headers:
1503
  - `'content-Type: application/json'`
1504
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
1550
  ### Request
1551
 
1552
  - Method: DELETE
1553
+ - URL: `/api/v1/chat`
1554
  - Headers:
1555
  - `'content-Type: application/json'`
1556
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
1598
 
1599
  ---
1600
 
1601
+ ## List chats
1602
 
1603
+ **GET** `/api/v1/chat?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={chat_name}&id={chat_id}`
1604
 
1605
+ Lists chat assistants.
1606
 
1607
  ### Request
1608
 
1609
  - Method: GET
1610
+ - URL: `/api/v1/chat?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={chat_name}&id={chat_id}`
1611
  - Headers:
1612
  - `'Authorization: Bearer {YOUR_API_KEY}'`
1613
 
 
1744
  ### Request
1745
 
1746
  - Method: POST
1747
+ - URL: `/api/v1/chat/{chat_id}/session`
1748
  - Headers:
1749
  - `'content-Type: application/json'`
1750
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
1839
  ### Request
1840
 
1841
  - Method: PUT
1842
+ - URL: `/api/v1/chat/{chat_id}/session/{session_id}`
1843
  - Headers:
1844
  - `'content-Type: application/json'`
1845
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
1894
  ### Request
1895
 
1896
  - Method: GET
1897
+ - URL: `/api/v1/chat/{chat_id}/session?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
1898
  - Headers:
1899
  - `'Authorization: Bearer {YOUR_API_KEY}'`
1900
 
 
1979
  ### Request
1980
 
1981
  - Method: DELETE
1982
+ - URL: `/api/v1/chat/{chat_id}/session`
1983
  - Headers:
1984
  - `'content-Type: application/json'`
1985
  - `'Authorization: Bearer {YOUR_API_KEY}'`
 
2035
  ### Request
2036
 
2037
  - Method: POST
2038
+ - URL: `/api/v1/chat/{chat_id}/completion`
2039
  - Headers:
2040
  - `'content-Type: application/json'`
2041
  - `'Authorization: Bearer {YOUR_API_KEY}'`
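
The updated HTTP reference gives each endpoint as a path relative to the server address. The sketch below shows one way a client might resolve such a path, append optional query parameters, and attach the documented `Authorization` header. It rests on assumptions: `build_request` is a hypothetical helper, `http://localhost:9380` is a placeholder address, and nothing is sent over the network.

```python
from urllib.parse import urlencode, urljoin


def build_request(address, path, api_key, params=None):
    """Return (url, headers) for a relative API path.

    Hypothetical helper for illustration; not part of RAGFlow.
    """
    # Normalize slashes so the path is appended to the address exactly once.
    url = urljoin(address.rstrip("/") + "/", path.lstrip("/"))
    if params:
        # Skip query parameters that were left unset.
        query = urlencode({k: v for k, v in params.items() if v is not None})
        if query:
            url = f"{url}?{query}"
    headers = {"Authorization": f"Bearer {api_key}"}
    return url, headers


url, headers = build_request(
    "http://localhost:9380",  # placeholder server address
    "/api/v1/dataset",
    "YOUR_API_KEY",
    params={"page": 1, "page_size": 30, "name": None},
)
print(url)
print(headers)
```

Requests with a JSON body would additionally need the `Content-Type: application/json` header listed for those endpoints.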
api/python_api_reference.md CHANGED
@@ -17,10 +17,9 @@ RAGFlow.create_dataset(
17
  name: str,
18
  avatar: str = "",
19
  description: str = "",
 
20
  language: str = "English",
21
  permission: str = "me",
22
- document_count: int = 0,
23
- chunk_count: int = 0,
24
  chunk_method: str = "naive",
25
  parser_config: DataSet.ParserConfig = None
26
  ) -> DataSet
@@ -143,7 +142,7 @@ RAGFlow.list_datasets(
143
  ) -> list[DataSet]
144
  ```
145
 
146
- Retrieves a list of datasets.
147
 
148
  ### Parameters
149
 
@@ -296,7 +295,7 @@ Updates configurations for the current document.
296
 
297
  A dictionary representing the attributes to update, with the following keys:
298
 
299
- - `"name"`: `str` The name of the document to update.
300
  - `"parser_config"`: `dict[str, Any]` The parsing configuration for the document:
301
  - `"chunk_token_count"`: Defaults to `128`.
302
  - `"layout_recognize"`: Defaults to `True`.
@@ -370,7 +369,7 @@ print(doc)
370
  Dataset.list_documents(id:str =None, keywords: str=None, offset: int=0, limit:int = 1024,order_by:str = "create_time", desc: bool = True) -> list[Document]
371
  ```
372
 
373
- Retrieves a list of documents from the current dataset.
374
 
375
  ### Parameters
376
 
@@ -388,7 +387,7 @@ The starting index for the documents to retrieve. Typically used in conjunction
388
 
389
  #### limit: `int`
390
 
391
- The maximum number of documents to retrieve. Defaults to `1024`. A value of `-1` indicates that all documents should be returned.
392
 
393
  #### orderby: `str`
394
 
@@ -412,7 +411,7 @@ A `Document` object contains the following attributes:
412
  - `name`: The document name. Defaults to `""`.
413
  - `thumbnail`: The thumbnail image of the document. Defaults to `None`.
414
  - `knowledgebase_id`: The dataset ID associated with the document. Defaults to `None`.
415
- - `chunk_method` The chunk method name. Defaults to `""`. ?????naive??????
416
  - `parser_config`: `ParserConfig` Configuration object for the parser. Defaults to `{"pages": [[1, 1000000]]}`.
417
  - `source_type`: The source type of the document. Defaults to `"local"`.
418
  - `type`: Type or category of the document. Defaults to `""`. Reserved for future use.
@@ -425,7 +424,7 @@ A `Document` object contains the following attributes:
425
  - `process_begin_at`: `datetime` The start time of document processing. Defaults to `None`.
426
  - `process_duation`: `float` Duration of the processing in seconds. Defaults to `0.0`.
427
  - `run`: `str` The document's processing status:
428
- - `"0"`: UNSTART (default)
429
  - `"1"`: RUNNING
430
  - `"2"`: CANCEL
431
  - `"3"`: DONE
@@ -506,9 +505,9 @@ The IDs of the documents to parse.
506
  rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
507
  dataset = rag_object.create_dataset(name="dataset_name")
508
  documents = [
509
- {'name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
510
- {'name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
511
- {'name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
512
  ]
513
  dataset.upload_documents(documents)
514
  documents = dataset.list_documents(keywords="test")
@@ -546,9 +545,9 @@ The IDs of the documents for which parsing should be stopped.
546
  rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
547
  dataset = rag_object.create_dataset(name="dataset_name")
548
  documents = [
549
- {'name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
550
- {'name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
551
- {'name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
552
  ]
553
  dataset.upload_documents(documents)
554
  documents = dataset.list_documents(keywords="test")
@@ -566,7 +565,7 @@ print("Async bulk parsing cancelled.")
566
  ## Add chunk
567
 
568
  ```python
569
- Document.add_chunk(content:str) -> Chunk ?????????????????????
570
  ```
571
 
572
  Adds a chunk to the current document.
@@ -577,7 +576,7 @@ Adds a chunk to the current document.
577
 
578
  The text content of the chunk.
579
 
580
- #### important_keywords: `list[str]` ??????????????????????
581
 
582
  The key terms or phrases to tag with the chunk.
583
 
@@ -588,7 +587,7 @@ The key terms or phrases to tag with the chunk.
588
 
589
  A `Chunk` object contains the following attributes:
590
 
591
- - `id`: `str`
592
  - `content`: `str` Content of the chunk.
593
  - `important_keywords`: `list[str]` A list of key terms or phrases to tag with the chunk.
594
  - `create_time`: `str` The time when the chunk was created (added to the document).
@@ -596,9 +595,9 @@ A `Chunk` object contains the following attributes:
596
  - `knowledgebase_id`: `str` The ID of the associated dataset.
597
  - `document_name`: `str` The name of the associated document.
598
  - `document_id`: `str` The ID of the associated document.
599
- - `available`: `int`???? The chunk's availability status in the dataset. Value options:
600
- - `0`: Unavailable
601
- - `1`: Available
602
 
603
 
604
  ### Examples
@@ -619,26 +618,26 @@ chunk = doc.add_chunk(content="xxxxxxx")
619
  ## List chunks
620
 
621
  ```python
622
- Document.list_chunks(keywords: str = None, offset: int = 0, limit: int = -1, id : str = None) -> list[Chunk]
623
  ```
624
 
625
- Retrieves a list of chunks from the current document.
626
 
627
  ### Parameters
628
 
629
- #### keywords: `str`
630
 
631
  The keywords used to match chunk content. Defaults to `None`
632
 
633
  #### offset: `int`
634
 
635
- The starting index for the chunks to retrieve. Defaults to `1`??????
636
 
637
- #### limit
638
 
639
- The maximum number of chunks to retrieve. Default: `30`?????????
640
 
641
- #### id
642
 
643
  The ID of the chunk to retrieve. Default: `None`
644
 
@@ -713,9 +712,9 @@ A dictionary representing the attributes to update, with the following keys:
713
 
714
  - `"content"`: `str` Content of the chunk.
715
  - `"important_keywords"`: `list[str]` A list of key terms or phrases to tag with the chunk.
716
- - `"available"`: `int` The chunk's availability status in the dataset. Value options:
717
- - `0`: Unavailable
718
- - `1`: Available
719
 
720
  ### Returns
721
 
@@ -741,10 +740,10 @@ chunk.update({"content":"sdfx..."})
741
  ## Retrieve chunks
742
 
743
  ```python
744
- RAGFlow.retrieve(question:str="", datasets:list[str]=None, document=list[str]=None, offset:int=1, limit:int=30, similarity_threshold:float=0.2, vector_similarity_weight:float=0.3, top_k:int=1024,rerank_id:str=None,keyword:bool=False,higlight:bool=False) -> list[Chunk]
745
  ```
746
 
747
- ???????
748
 
749
  ### Parameters
750
 
@@ -752,21 +751,21 @@ RAGFlow.retrieve(question:str="", datasets:list[str]=None, document=list[str]=No
752
 
753
  The user query or query keywords. Defaults to `""`.
754
 
755
- #### datasets: `list[str]`, *Required*?????
756
 
757
  The datasets to search from.
758
 
759
  #### document: `list[str]`
760
 
761
- The documents to search from. `None` means no limitation. Defaults to `None`.
762
 
763
  #### offset: `int`
764
 
765
- The starting index for the documents to retrieve. Defaults to `0`??????.
766
 
767
  #### limit: `int`
768
 
769
- The maximum number of chunks to retrieve. Defaults to `6`.???????????????
770
 
771
  #### Similarity_threshold: `float`
772
 
@@ -786,14 +785,17 @@ The ID of the rerank model. Defaults to `None`.
786
 
787
  #### keyword: `bool`
788
 
789
- Indicates whether keyword-based matching is enabled:
790
 
791
- - `True`: Enabled.
792
- - `False`: Disabled (default).
793
 
794
  #### highlight: `bool`
795
 
796
- Specifying whether to enable highlighting of matched terms in the results (True) or not (False).
 
 
 
797
 
798
  ### Returns
799
 
@@ -849,15 +851,15 @@ Creates a chat assistant.
849
 
850
  The following shows the attributes of a `Chat` object:
851
 
852
- #### name: `str`, *Required*????????
853
 
854
- The name of the chat assistant. Defaults to `"assistant"`.
855
 
856
  #### avatar: `str`
857
 
858
  Base64 encoding of the avatar. Defaults to `""`.
859
 
860
- #### knowledgebases: `list[str]`
861
 
862
  The IDs of the associated datasets. Defaults to `[""]`.
863
 
@@ -1016,7 +1018,7 @@ RAGFlow.list_chats(
1016
  ) -> list[Chat]
1017
  ```
1018
 
1019
- Retrieves a list of chat assistants.
1020
 
1021
  ### Parameters
1022
 
 
17
  name: str,
18
  avatar: str = "",
19
  description: str = "",
20
+ embedding_model: str = "BAAI/bge-zh-v1.5",
21
  language: str = "English",
22
  permission: str = "me",
 
 
23
  chunk_method: str = "naive",
24
  parser_config: DataSet.ParserConfig = None
25
  ) -> DataSet
 
142
  ) -> list[DataSet]
143
  ```
144
 
145
+ Lists datasets.
146
 
147
  ### Parameters
148
 
 
295
 
296
  A dictionary representing the attributes to update, with the following keys:
297
 
298
+ - `"display_name"`: `str` The name of the document to update.
299
  - `"parser_config"`: `dict[str, Any]` The parsing configuration for the document:
300
  - `"chunk_token_count"`: Defaults to `128`.
301
  - `"layout_recognize"`: Defaults to `True`.
 
369
  Dataset.list_documents(id:str =None, keywords: str=None, offset: int=0, limit:int = 1024,order_by:str = "create_time", desc: bool = True) -> list[Document]
370
  ```
371
 
372
+ Lists documents in the current dataset.
373
 
374
  ### Parameters
375
 
 
387
 
388
  #### limit: `int`
389
 
390
+ The maximum number of documents to retrieve. Defaults to `1024`.
391
 
392
  #### orderby: `str`
393
 
 
411
  - `name`: The document name. Defaults to `""`.
412
  - `thumbnail`: The thumbnail image of the document. Defaults to `None`.
413
  - `knowledgebase_id`: The dataset ID associated with the document. Defaults to `None`.
414
+ - `chunk_method`: The chunk method name. Defaults to `"naive"`.
415
  - `parser_config`: `ParserConfig` Configuration object for the parser. Defaults to `{"pages": [[1, 1000000]]}`.
416
  - `source_type`: The source type of the document. Defaults to `"local"`.
417
  - `type`: Type or category of the document. Defaults to `""`. Reserved for future use.
 
424
  - `process_begin_at`: `datetime` The start time of document processing. Defaults to `None`.
425
  - `process_duation`: `float` Duration of the processing in seconds. Defaults to `0.0`.
426
  - `run`: `str` The document's processing status:
427
+ - `"0"`: UNSTART (default)
428
  - `"1"`: RUNNING
429
  - `"2"`: CANCEL
430
  - `"3"`: DONE
 
505
  rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
506
  dataset = rag_object.create_dataset(name="dataset_name")
507
  documents = [
508
+ {'display_name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
509
+ {'display_name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
510
+ {'display_name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
511
  ]
512
  dataset.upload_documents(documents)
513
  documents = dataset.list_documents(keywords="test")
 
545
  rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
546
  dataset = rag_object.create_dataset(name="dataset_name")
547
  documents = [
548
+ {'display_name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
549
+ {'display_name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
550
+ {'display_name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
551
  ]
552
  dataset.upload_documents(documents)
553
  documents = dataset.list_documents(keywords="test")
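
The upload examples above read each file with a bare `open(...).read()`, which leaves closing the file handle to the interpreter. A minimal sketch of an alternative, assuming a hypothetical helper `build_documents` (not part of the RAGFlow SDK) that produces the same `'display_name'` and `'blob'` keys:

```python
import tempfile
from pathlib import Path


def build_documents(paths):
    """Return upload entries with 'display_name' and 'blob' keys.

    Hypothetical helper for illustration; not part of the RAGFlow SDK.
    """
    documents = []
    for path in map(Path, paths):
        # read_bytes() opens and closes the file, so no handle is left open.
        documents.append({"display_name": path.name, "blob": path.read_bytes()})
    return documents


# Demonstration with a temporary file instead of ./test_data.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "test1.txt"
    sample.write_bytes(b"hello")
    docs = build_documents([sample])

print(docs[0]["display_name"], len(docs[0]["blob"]))
```

The returned list has the same shape as the `documents` list in the example, so it could be passed to `dataset.upload_documents(...)` in its place.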
 
565
  ## Add chunk
566
 
567
  ```python
568
+ Document.add_chunk(content:str, important_keywords:list[str] = []) -> Chunk
569
  ```
570
 
571
  Adds a chunk to the current document.
 
576
 
577
  The text content of the chunk.
578
 
579
+ #### important_keywords: `list[str]`
580
 
581
  The key terms or phrases to tag with the chunk.
582
 
 
587
 
588
  A `Chunk` object contains the following attributes:
589
 
590
+ - `id`: `str`
591
  - `content`: `str` Content of the chunk.
592
  - `important_keywords`: `list[str]` A list of key terms or phrases to tag with the chunk.
593
  - `create_time`: `str` The time when the chunk was created (added to the document).
 
595
  - `knowledgebase_id`: `str` The ID of the associated dataset.
596
  - `document_name`: `str` The name of the associated document.
597
  - `document_id`: `str` The ID of the associated document.
598
+ - `available`: `bool` The chunk's availability status in the dataset. Value options:
599
+ - `False`: Unavailable
600
+ - `True`: Available
601
 
602
 
603
  ### Examples
 
618
  ## List chunks
619
 
620
  ```python
621
+ Document.list_chunks(keywords: str = None, offset: int = 1, limit: int = 1024, id : str = None) -> list[Chunk]
622
  ```
623
 
624
+ Lists chunks in the current document.
625
 
626
  ### Parameters
627
 
628
+ #### keywords: `str`
629
 
630
  The keywords used to match chunk content. Defaults to `None`
631
 
632
  #### offset: `int`
633
 
634
+ The starting index for the chunks to retrieve. Defaults to `1`.
635
 
636
+ #### limit: `int`
637
 
638
+ The maximum number of chunks to retrieve. Default: `1024`
639
 
640
+ #### id: `str`
641
 
642
  The ID of the chunk to retrieve. Default: `None`
643
 
 
712
 
713
  - `"content"`: `str` Content of the chunk.
714
  - `"important_keywords"`: `list[str]` A list of key terms or phrases to tag with the chunk.
715
+ - `"available"`: `bool` The chunk's availability status in the dataset. Value options:
716
+ - `False`: Unavailable
717
+ - `True`: Available
718
 
719
  ### Returns
720
 
 
740
  ## Retrieve chunks
741
 
742
  ```python
743
+ RAGFlow.retrieve(question:str="", datasets:list[str]=None, document:list[str]=None, offset:int=1, limit:int=1024, similarity_threshold:float=0.2, vector_similarity_weight:float=0.3, top_k:int=1024,rerank_id:str=None,keyword:bool=False,highlight:bool=False) -> list[Chunk]
744
  ```
745
 
746
+ Retrieves chunks from specified datasets.
747
 
748
  ### Parameters
749
 
 
751
 
752
  The user query or query keywords. Defaults to `""`.
753
 
754
+ #### datasets: `list[str]`, *Required*
755
 
756
  The datasets to search from.
757
 
758
  #### document: `list[str]`
759
 
760
+ The documents to search from. Defaults to `None`.
761
 
762
  #### offset: `int`
763
 
764
+ The starting index for the documents to retrieve. Defaults to `1`.
765
 
766
  #### limit: `int`
767
 
768
+ The maximum number of chunks to retrieve. Defaults to `1024`.
769
 
770
  #### Similarity_threshold: `float`
771
 
 
785
 
786
  #### keyword: `bool`
787
 
788
+ Indicates whether to enable keyword-based matching:
789
 
790
+ - `True`: Enable keyword-based matching.
791
+ - `False`: Disable keyword-based matching (default).
792
 
793
  #### highlight: `bool`
794
 
795
+ Specifies whether to enable highlighting of matched terms in the results:
796
+
797
+ - `True`: Enable highlighting of matched terms.
798
+ - `False`: Disable highlighting of matched terms (default).
799
 
800
  ### Returns
801
 
 
851
 
852
  The following shows the attributes of a `Chat` object:
853
 
854
+ #### name: `str`, *Required*
855
 
856
+ The name of the chat assistant.
857
 
858
  #### avatar: `str`
859
 
860
  Base64 encoding of the avatar. Defaults to `""`.
861
 
862
+ #### knowledgebases: `list[str]`
863
 
864
  The IDs of the associated datasets. Defaults to `[""]`.
865
 
 
1018
  ) -> list[Chat]
1019
  ```
1020
 
1021
+ Lists chat assistants.
1022
 
1023
  ### Parameters
1024