writinwaters committed
Commit ce611dd · Parent: 2ace2a9

Updated HTTP API reference and Python API reference based on test results (#3090)

### What problem does this PR solve?

### Type of change

- [x] Documentation Update

Files changed:

- api/http_api_reference.md +9 -7
- api/python_api_reference.md +4 -5
api/http_api_reference.md
CHANGED
@@ -94,8 +94,10 @@ curl --request POST \
   The configuration settings for the dataset parser, a JSON object containing the following attributes:
   - `"chunk_token_count"`: Defaults to `128`.
   - `"layout_recognize"`: Defaults to `true`.
+  - `"html4excel"`: Indicates whether to convert Excel documents into HTML format. Defaults to `false`.
   - `"delimiter"`: Defaults to `"\n!?。;!?"`.
-  - `"task_page_size"`: Defaults to `12`.
+  - `"task_page_size"`: Defaults to `12`. For PDF only.
+  - `"raptor"`: Raptor-specific settings. Defaults to: `{"use_raptor": false}`.
 
 ### Response
 
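For orientation, here is a minimal Python sketch of a create-dataset request that sets every `parser_config` attribute listed in this hunk, including the newly documented `html4excel` and `raptor` keys. The `/api/v1/datasets` path, bearer-token header, base URL, API key, and dataset name follow the pattern used elsewhere in the reference and are illustrative assumptions, not values taken from this commit.

```python
import requests

RAGFLOW_BASE_URL = "http://localhost:9380"  # assumed server address
RAGFLOW_API_KEY = "<YOUR_API_KEY>"          # assumed credential

# Dataset-creation payload using the documented parser_config attributes.
payload = {
    "name": "example_dataset",  # hypothetical dataset name
    "parser_config": {
        "chunk_token_count": 128,         # default
        "layout_recognize": True,         # default
        "html4excel": False,              # new: convert Excel documents to HTML
        "delimiter": "\n!?。;!?",         # default
        "task_page_size": 12,             # PDF only
        "raptor": {"use_raptor": False},  # new: RAPTOR settings
    },
}

resp = requests.post(
    f"{RAGFLOW_BASE_URL}/api/v1/datasets",
    headers={"Authorization": f"Bearer {RAGFLOW_API_KEY}"},
    json=payload,
)
print(resp.json())
```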
@@ -177,7 +179,7 @@ curl --request DELETE \
 
 #### Request parameters
 
-- `"ids"`: (*Body parameter*), `list[string]`
+- `"ids"`: (*Body parameter*), `list[string]`
   The IDs of the datasets to delete. If it is not specified, all datasets will be deleted.
 
 ### Response
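A matching sketch for the `"ids"` body parameter above: deleting specific datasets over HTTP. The route and auth header are assumed from the surrounding reference, the IDs are made-up placeholders, and omitting `"ids"` would delete all datasets.

```python
import requests

RAGFLOW_BASE_URL = "http://localhost:9380"  # assumed server address
RAGFLOW_API_KEY = "<YOUR_API_KEY>"          # assumed credential

# Delete two specific datasets; leave out "ids" to delete every dataset.
resp = requests.delete(
    f"{RAGFLOW_BASE_URL}/api/v1/datasets",
    headers={"Authorization": f"Bearer {RAGFLOW_API_KEY}"},
    json={"ids": ["<DATASET_ID_1>", "<DATASET_ID_2>"]},  # hypothetical IDs
)
print(resp.json())
```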
@@ -241,7 +243,7 @@ curl --request PUT \
 - `"embedding_model"`: (*Body parameter*), `string`
   The updated embedding model name.
   - Ensure that `"chunk_count"` is `0` before updating `"embedding_model"`.
-- `"chunk_method"`: (*Body parameter*), `enum<string>`
+- `"chunk_method"`: (*Body parameter*), `enum<string>`
   The chunking method for the dataset. Available options:
   - `"naive"`: General
   - `"manual`: Manual
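As a sketch of the update-dataset parameters in this hunk, the request below changes `chunk_method` and, assuming `chunk_count` is still `0`, the `embedding_model`. The `/api/v1/datasets/{dataset_id}` route, auth header, base URL, and model name are illustrative assumptions rather than values confirmed by this commit.

```python
import requests

RAGFLOW_BASE_URL = "http://localhost:9380"  # assumed server address
RAGFLOW_API_KEY = "<YOUR_API_KEY>"          # assumed credential
dataset_id = "<DATASET_ID>"                 # hypothetical ID

# Switch the dataset to the "naive" (General) chunking method and update its
# embedding model; per the docs, the model can only change while chunk_count is 0.
resp = requests.put(
    f"{RAGFLOW_BASE_URL}/api/v1/datasets/{dataset_id}",
    headers={"Authorization": f"Bearer {RAGFLOW_API_KEY}"},
    json={
        "chunk_method": "naive",                      # General
        "embedding_model": "BAAI/bge-large-zh-v1.5",  # assumed model name
    },
)
print(resp.json())
```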
@@ -510,12 +512,12 @@ curl --request PUT \
   - `"one"`: One
   - `"knowledge_graph"`: Knowledge Graph
   - `"email"`: Email
-- `"parser_config"`: (*Body parameter*), `object`
+- `"parser_config"`: (*Body parameter*), `object`
   The parsing configuration for the document:
   - `"chunk_token_count"`: Defaults to `128`.
   - `"layout_recognize"`: Defaults to `true`.
   - `"delimiter"`: Defaults to `"\n!?。;!?"`.
-  - `"task_page_size"`: Defaults to `12`.
+  - `"task_page_size"`: Defaults to `12`. For PDF only.
 
 ### Response
 
@@ -718,7 +720,7 @@ curl --request DELETE \
 
 - `dataset_id`: (*Path parameter*)
   The associated dataset ID.
-- `"ids"`: (*Body parameter*), `list[string]`
+- `"ids"`: (*Body parameter*), `list[string]`
   The IDs of the documents to delete. If it is not specified, all documents in the specified dataset will be deleted.
 
 ### Response
@@ -1169,7 +1171,7 @@ Failure:
 
 ## Retrieve chunks
 
-**
+**POST** `/api/v1/retrieval`
 
 Retrieves chunks from specified datasets.
 
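Since this hunk pins the retrieval endpoint to `POST /api/v1/retrieval`, here is a minimal sketch of calling it. Only the route comes from this commit; the `question` and `dataset_ids` body fields, base URL, and auth header are assumptions based on the endpoint's stated purpose.

```python
import requests

RAGFLOW_BASE_URL = "http://localhost:9380"  # assumed server address
RAGFLOW_API_KEY = "<YOUR_API_KEY>"          # assumed credential

# Retrieve chunks from one or more datasets for a natural-language query.
resp = requests.post(
    f"{RAGFLOW_BASE_URL}/api/v1/retrieval",
    headers={"Authorization": f"Bearer {RAGFLOW_API_KEY}"},
    json={
        "question": "What is RAGFlow?",   # assumed field name
        "dataset_ids": ["<DATASET_ID>"],  # assumed field name, hypothetical ID
    },
)
print(resp.json())
```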
api/python_api_reference.md
CHANGED
@@ -1253,7 +1253,7 @@ Asks a question to start an AI-powered conversation.
 
 #### question: `str` *Required*
 
-The question to start an AI
+The question to start an AI-powered conversation.
 
 #### stream: `bool`
 
@@ -1286,7 +1286,7 @@ A list of `Chunk` objects representing references to the message, each containing
 - `content` `str`
   The content of the chunk.
 - `image_id` `str`
-  The ID of the snapshot of the chunk.
+  The ID of the snapshot of the chunk. Applicable only when the source of the chunk is an image, PPT, PPTX, or PDF file.
 - `document_id` `str`
   The ID of the referenced document.
 - `document_name` `str`
@@ -1295,14 +1295,13 @@ A list of `Chunk` objects representing references to the message, each containing
   The location information of the chunk within the referenced document.
 - `dataset_id` `str`
   The ID of the dataset to which the referenced document belongs.
-- `similarity` `float`
-  A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity.
+- `similarity` `float`
+  A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity. It is the weighted sum of `vector_similarity` and `term_similarity`.
 - `vector_similarity` `float`
   A vector similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between vector embeddings.
 - `term_similarity` `float`
   A keyword similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between keywords.
 
-
 ### Examples
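Taken together, the Python-reference hunks above describe `Session.ask()` (its required `question` and the `stream` flag) and the reference `Chunk` attributes, including the clarified `similarity` score. The sketch below shows how those fields might be read; the SDK bootstrap (`ragflow_sdk` package, `RAGFlow`, `list_chats`, `create_session`) follows the usual pattern from the wider reference and is assumed here, not taken from this commit.

```python
from ragflow_sdk import RAGFlow  # assumed package and class names

rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://localhost:9380")  # assumed setup
assistant = rag.list_chats(name="example_chat")[0]  # assumed: fetch an existing chat
session = assistant.create_session()

# With stream=True, ask() yields partial messages; keep the last one.
answer = None
for message in session.ask("What is RAGFlow?", stream=True):
    answer = message

# Inspect the reference chunks documented above; similarity is the weighted
# sum of vector_similarity and term_similarity.
for chunk in answer.reference:
    print(chunk.document_name, chunk.similarity, chunk.vector_similarity, chunk.term_similarity)
```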