writinwaters jinhai-2012 committed on
Commit e9c1552 · 1 Parent(s): 811d178

Updated chat APIs (#2831)

### What problem does this PR solve?



### Type of change

- [x] Documentation Update

---------

Signed-off-by: Jin Hai <[email protected]>
Co-authored-by: Jin Hai <[email protected]>

Files changed (2)
  1. api/http_api.md +3 -1
  2. api/python_api_reference.md +134 -106
api/http_api.md CHANGED
@@ -1,5 +1,7 @@
1
 
2
- # HTTP API Reference
 
 
3
 
4
  ## Create dataset
5
 
 
1
 
2
+ # DRAFT! HTTP API Reference
3
+
4
+ **THE API REFERENCES BELOW ARE STILL UNDER DEVELOPMENT.**
5
 
6
  ## Create dataset
7
 
api/python_api_reference.md CHANGED
@@ -1,5 +1,7 @@
1
  # DRAFT Python API Reference
2
 
 
 
3
  :::tip NOTE
4
  Knowledgebase APIs
5
  :::
@@ -40,6 +42,8 @@ The unique name of the dataset to create. It must adhere to the following requir
40
 
41
  Base64 encoding of the avatar. Defaults to `""`.
42
 
 
 
43
  #### tenant_id: `str`
44
 
45
  The ID of the tenant associated with the created dataset, used to identify different users. Defaults to `None`.
@@ -55,14 +59,7 @@ The description of the created dataset. Defaults to `""`.
55
 
56
  The language setting of the created dataset. Defaults to `"English"`. ????????????
57
 
58
- #### embedding_model: `str`
59
-
60
- The specific model used by the dataset to generate vector embeddings. Defaults to `""`.
61
-
62
- - If creating a dataset, embedding_model must not be provided.
63
- - If updating a dataset, embedding_model can't be changed.
64
-
65
- #### permission: `str`
66
 
67
  Specify who can operate on the dataset. Defaults to `"me"`.
68
 
@@ -70,36 +67,35 @@ Specify who can operate on the dataset. Defaults to `"me"`.
70
 
71
  The number of documents associated with the dataset. Defaults to `0`.
72
 
73
- - If updating a dataset, `document_count` can't be changed.
74
-
75
  #### chunk_count: `int`
76
 
77
  The number of data chunks generated or processed by the created dataset. Defaults to `0`.
78
 
79
- - If updating a dataset, chunk_count can't be changed.
80
-
81
  #### parse_method, `str`
82
 
83
- The method used by the dataset to parse and process data.
84
 
85
- - If updating parse_method in a dataset, chunk_count must be greater than 0. Defaults to `"naive"`.
86
 
87
- #### parser_config, `Dataset.ParserConfig`
88
 
89
- The configuration settings for the parser used by the dataset.
 
 
 
90
 
91
  ### Returns
92
- ```python
93
- DataSet
94
- description: dataset object
95
- ```
96
  ### Examples
97
 
98
  ```python
99
  from ragflow import RAGFlow
100
 
101
- rag = RAGFlow(api_key="xxxxxx", base_url="http://xxx.xx.xx.xxx:9380")
102
- ds = rag.create_dataset(name="kb_1")
103
  ```
104
 
105
  ---
@@ -107,28 +103,25 @@ ds = rag.create_dataset(name="kb_1")
107
  ## Delete knowledge bases
108
 
109
  ```python
110
- RAGFlow.delete_datasets(ids: List[str] = None)
111
  ```
112
- Deletes knowledge bases.
113
- ### Parameters
114
 
115
- #### ids: `List[str]`
116
 
117
- The ids of the datasets to be deleted.
 
 
118
 
 
119
 
120
  ### Returns
121
 
122
- ```python
123
- no return
124
- ```
125
 
126
  ### Examples
127
 
128
  ```python
129
- from ragflow import RAGFlow
130
-
131
- rag = RAGFlow(api_key="xxxxxx", base_url="http://xxx.xx.xx.xxx:9380")
132
  rag.delete_datasets(ids=["id_1","id_2"])
133
  ```
134
 
@@ -147,17 +140,17 @@ RAGFlow.list_datasets(
147
  ) -> List[DataSet]
148
  ```
149
 
150
- Lists all knowledge bases in the RAGFlow system.
151
 
152
  ### Parameters
153
 
154
  #### page: `int`
155
 
156
- The current page number to retrieve from the paginated data. This parameter determines which set of records will be fetched. Defaults to `1`.
157
 
158
  #### page_size: `int`
159
 
160
- The number of records to retrieve per page. This controls how many records will be included in each page. Defaults to `1024`.
161
 
162
  #### order_by: `str`
163
 
@@ -177,46 +170,71 @@ The name of the dataset to be got. Defaults to `None`.
177
 
178
  ### Returns
179
 
180
- ```python
181
- List[DataSet]
182
- description:the list of datasets.
183
- ```
184
 
185
  ### Examples
186
 
187
- ```python
188
- from ragflow import RAGFlow
189
 
190
- rag = RAGFlow(api_key="xxxxxx", base_url="http://xxx.xx.xx.xxx:9380")
191
- for ds in rag.list_datasets():
192
      print(ds)
193
  ```
194
 
195
- ---
196
 
 
 
 
 
 
 
197
 
198
- ## Update knowledge base
199
 
200
  ```python
201
  DataSet.update(update_message: dict)
202
  ```
203
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
204
  ### Returns
205
 
206
- ```python
207
- no return
208
- ```
209
 
210
  ### Examples
211
 
212
  ```python
213
  from ragflow import RAGFlow
214
 
215
- rag = RAGFlow(api_key="xxxxxx", base_url="http://xxx.xx.xx.xxx:9380")
216
- ds = rag.get_dataset(name="kb_1")
217
- ds.update({"parse_method":"manual", ...}}
218
  ```
219
-
220
  ---
221
 
222
  :::tip API GROUPING
@@ -709,6 +727,8 @@ Chat APIs
709
 
710
  ## Create chat
711
 
 
 
712
  ```python
713
  RAGFlow.create_chat(
714
  name: str = "assistant",
@@ -717,41 +737,35 @@ RAGFlow.create_chat(
717
  llm: Chat.LLM = None,
718
  prompt: Chat.Prompt = None
719
  ) -> Chat
720
-
721
  ```
722
 
723
  ### Returns
724
 
725
- Chat
726
-
727
- description: assitant object.
728
 
729
  #### name: `str`
730
 
731
- The name of the created chat. Defaults to `"assistant"`.
732
 
733
  #### avatar: `str`
734
 
735
- The icon of the created chat. Defaults to `"path"`.
736
-
737
- #### knowledgebases: `List[DataSet]`
738
 
739
- Select knowledgebases associated. Defaults to `["kb1"]`.
740
 
741
- #### id: `str`
742
-
743
- The id of the created chat. Defaults to `""`.
744
 
745
  #### llm: `LLM`
746
 
747
  The llm of the created chat. Defaults to `None`. When the value is `None`, a dictionary with the following values will be generated as the default.
748
 
749
  - **model_name**, `str`
750
- Large language chat model. If it is `None`, it will return the user's default model.
751
  - **temperature**, `float`
752
  This parameter controls the randomness of predictions by the model. A lower temperature makes the model more confident in its responses, while a higher temperature makes it more creative and diverse. Defaults to `0.1`.
753
  - **top_p**, `float`
754
- Also known as “nucleus sampling,” this parameter sets a threshold to select a smaller set of words to sample from. It focuses on the most likely words, cutting off the less probable ones. Defaults to `0.3`
755
  - **presence_penalty**, `float`
756
  This discourages the model from repeating the same information by penalizing words that have already appeared in the conversation. Defaults to `0.2`.
757
  - **frequency_penalty**, `float`
@@ -761,9 +775,8 @@ The llm of the created chat. Defaults to `None`. When the value is `None`, a dic
761
 
762
  #### Prompt: `str`
763
 
764
- Instructions you need LLM to follow when LLM answers questions, like character design, answer length and answer language etc.
765
 
766
- Defaults:
767
  ```
768
  You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence "The answer you are looking for is not found in the knowledge base!" Answers need to consider chat history.
769
  Here is the knowledge base:
@@ -776,62 +789,81 @@ You are an intelligent assistant. Please summarize the content of the knowledge
776
  ```python
777
  from ragflow import RAGFlow
778
 
779
- rag = RAGFlow(api_key="xxxxxx", base_url="http://xxx.xx.xx.xxx:9380")
780
- kb = rag.get_dataset(name="kb_1")
781
- assi = rag.create_chat("Miss R", knowledgebases=[kb])
782
  ```
783
 
784
  ---
785
 
786
  ## Update chat
787
 
 
 
788
  ```python
789
  Chat.update(update_message: dict)
790
  ```
791
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
792
  ### Returns
793
 
794
- ```python
795
- no return
796
- ```
797
 
798
  ### Examples
799
 
800
  ```python
801
  from ragflow import RAGFlow
802
 
803
- rag = RAGFlow(api_key="xxxxxx", base_url="http://xxx.xx.xx.xxx:9380")
804
- kb = rag.get_knowledgebase(name="kb_1")
805
- assi = rag.create_chat("Miss R" knowledgebases=[kb])
806
- assi.update({"temperature":0.8})
 
807
  ```
808
 
809
  ---
810
 
811
  ## Delete chats
812
 
 
 
813
  ```python
814
  RAGFlow.delete_chats(ids: List[str] = None)
815
  ```
816
- ### Parameters
817
 
818
- #### ids: `str`
819
 
820
- IDs of the chats to be deleted.
821
 
 
822
 
823
  ### Returns
824
 
825
- ```python
826
- no return
827
- ```
828
 
829
  ### Examples
830
 
831
  ```python
832
  from ragflow import RAGFlow
833
 
834
- rag = RAGFlow(api_key="xxxxxx", base_url="http://xxx.xx.xx.xxx:9380")
835
  rag.delete_chats(ids=["id_1","id_2"])
836
  ```
837
 
@@ -852,47 +884,43 @@ RAGFlow.list_chats(
852
 
853
  ### Parameters
854
 
855
- #### page: `int`
856
 
857
- The current page number to retrieve from the paginated data. This parameter determines which set of records will be fetched.
858
- - `1`
859
 
860
- #### page_size: `int`
861
 
862
- The number of records to retrieve per page. This controls how many records will be included in each page.
863
- - `1024`
864
 
865
- #### orderby: `string`
866
 
867
- The field by which the records should be sorted. This specifies the attribute or column used to order the results.
868
- - `"create_time"`
869
 
870
- #### desc: `bool`
871
 
872
- A boolean flag indicating whether the sorting should be in descending order.
873
- - `True`
874
 
875
  #### id: `string`
876
 
877
- The ID of the chat to be retrieved.
878
- - `None`
879
 
880
  #### name: `string`
881
 
882
- The name of the chat to be retrieved.
883
- - `None`
884
  ### Returns
885
 
886
- A list of chat objects.
 
887
 
888
  ### Examples
889
 
890
  ```python
891
  from ragflow import RAGFlow
892
 
893
- rag = RAGFlow(api_key="xxxxxx", base_url="http://xxx.xx.xx.xxx:9380")
894
- for assi in rag.list_chats():
895
-     print(assi)
896
  ```
897
 
898
  ---
 
1
  # DRAFT Python API Reference
2
 
3
+ **THE API REFERENCES BELOW ARE STILL UNDER DEVELOPMENT.**
4
+
5
  :::tip NOTE
6
  Knowledgebase APIs
7
  :::
 
42
 
43
  Base64 encoding of the avatar. Defaults to `""`.
44
 
45
+ #### description
46
+
47
  #### tenant_id: `str`
48
 
49
  The ID of the tenant associated with the created dataset, used to identify different users. Defaults to `None`.
 
59
 
60
  The language setting of the created dataset. Defaults to `"English"`. ????????????
61
 
62
+ #### permission
 
 
 
 
 
 
 
63
 
64
  Specify who can operate on the dataset. Defaults to `"me"`.
65
 
 
67
 
68
  The number of documents associated with the dataset. Defaults to `0`.
69
 
 
 
70
  #### chunk_count: `int`
71
 
72
  The number of data chunks generated or processed by the created dataset. Defaults to `0`.
73
 
 
 
74
  #### parse_method, `str`
75
 
76
+ The method used by the dataset to parse and process data. Defaults to `"naive"`.
77
 
78
+ #### parser_config
79
 
80
+ The parser configuration of the dataset. A `ParserConfig` object contains the following attributes:
81
 
82
+ - `chunk_token_count`: Defaults to `128`.
83
+ - `layout_recognize`: Defaults to `True`.
84
+ - `delimiter`: Defaults to `'\n!?。;!?'`.
85
+ - `task_page_size`: Defaults to `12`.
86
 
87
  ### Returns
88
+
89
+ - Success: A `DataSet` object.
90
+ - Failure: `Exception`
91
+
92
  ### Examples
93
 
94
  ```python
95
  from ragflow import RAGFlow
96
 
97
+ rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
98
+ ds = rag_object.create_dataset(name="kb_1")
99
  ```
100
 
101
  ---
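
The `parser_config` defaults listed above can be mirrored in a small local structure when preparing or checking settings. The following is a minimal sketch, not the SDK's own `ParserConfig` class: `ParserConfigSketch` and `with_overrides` are hypothetical names, and the delimiter is assumed to start with a newline character.

```python
from dataclasses import asdict, dataclass


@dataclass
class ParserConfigSketch:
    """Hypothetical stand-in holding the documented parser_config defaults."""

    chunk_token_count: int = 128
    layout_recognize: bool = True
    delimiter: str = "\n!?。;!?"  # assumption: the documented '\n' is a newline
    task_page_size: int = 12


def with_overrides(**overrides) -> dict:
    """Return the documented defaults with the given attributes overridden."""
    # Unknown attribute names raise TypeError, which catches typos early.
    return asdict(ParserConfigSketch(**overrides))


config = with_overrides(chunk_token_count=256)
print(config)
```

Only the attribute names and default values come from the reference text; how the SDK consumes such a dictionary is not shown here.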
 
103
  ## Delete knowledge bases
104
 
105
  ```python
106
+ RAGFlow.delete_datasets(ids: list[str] = None)
107
  ```
 
 
108
 
109
+ Deletes knowledge bases by ID.
110
 
111
+ ### Parameters
112
+
113
+ #### ids
114
 
115
+ The IDs of the knowledge bases to delete.
116
 
117
  ### Returns
118
 
119
+ - Success: No value is returned.
120
+ - Failure: `Exception`
 
121
 
122
  ### Examples
123
 
124
  ```python
 
 
 
125
  rag.delete_datasets(ids=["id_1","id_2"])
126
  ```
127
 
 
140
  ) -> List[DataSet]
141
  ```
142
 
143
+ Retrieves a list of knowledge bases.
144
 
145
  ### Parameters
146
 
147
  #### page: `int`
148
 
149
+ The current page number to retrieve from the paginated results. Defaults to `1`.
150
 
151
  #### page_size: `int`
152
 
153
+ The number of records on each page. Defaults to `1024`.
154
 
155
  #### order_by: `str`
156
 
 
170
 
171
  ### Returns
172
 
173
+ - Success: A list of `DataSet` objects representing the retrieved knowledge bases.
174
+ - Failure: `Exception`.
 
 
175
 
176
  ### Examples
177
 
178
+ #### List all knowledge bases
 
179
 
180
+ ```python
181
+ for ds in rag_object.list_datasets():
182
      print(ds)
183
  ```
184
 
185
+ #### Retrieve a knowledge base by ID
186
 
187
+ ```python
188
+ dataset = rag_object.list_datasets(id="id_1")
189
+ print(dataset[0])
190
+ ```
191
+
192
+ ---
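
The `page` and `page_size` parameters described above can be illustrated without a running server. This is a rough sketch that assumes pages are numbered from 1 (as the default of `1` suggests); `paginate` is a hypothetical helper, not part of the SDK.

```python
def paginate(records: list, page: int = 1, page_size: int = 1024) -> list:
    """Return the slice of records that falls on the requested page."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive integers")
    start = (page - 1) * page_size
    return records[start:start + page_size]


# Seven made-up knowledge base names, three per page.
names = [f"kb_{i}" for i in range(1, 8)]
first_page = paginate(names, page=1, page_size=3)
last_page = paginate(names, page=3, page_size=3)
print(first_page, last_page)
```

With the default `page_size` of `1024`, a single call covers most deployments, so paging only matters for large collections.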
193
 
194
+ ## Update knowledge base
195
 
196
  ```python
197
  DataSet.update(update_message: dict)
198
  ```
199
 
200
+ Updates the current knowledge base.
201
+
202
+ ### Parameters
203
+
204
+ #### update_message: `dict[str, str|int]`, *Required*
205
+
206
+ - `"name"`: `str` The name of the knowledge base to update.
207
+ - `"tenant_id"`: `str` The `"tenant_id"` you get after calling `create_dataset()`.
208
+ - `"embedding_model"`: `str` The embedding model for generating vector embeddings.
209
+   - Ensure that `"chunk_count"` is `0` before updating `"embedding_model"`.
210
+ - `"parse_method"`: `str` The method used to parse and process data. Available options:
211
+   - `"naive"`: General
212
+   - `"manual"`: Manual
213
+   - `"qa"`: Q&A
214
+   - `"table"`: Table
215
+   - `"paper"`: Paper
216
+   - `"book"`: Book
217
+   - `"laws"`: Laws
218
+   - `"presentation"`: Presentation
219
+   - `"picture"`: Picture
220
+   - `"one"`: One
221
+   - `"knowledge_graph"`: Knowledge Graph
222
+   - `"email"`: Email
223
+
224
  ### Returns
225
 
226
+ - Success: No value is returned.
227
+ - Failure: `Exception`
 
228
 
229
  ### Examples
230
 
231
  ```python
232
  from ragflow import RAGFlow
233
 
234
+ rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
235
+ ds = rag.list_datasets(name="kb_1")[0]
236
+ ds.update({"embedding_model":"BAAI/bge-zh-v1.5", "parse_method":"manual"})
237
  ```
 
238
  ---
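
The constraints on `update_message` described above can be checked locally before calling `DataSet.update()`. The sketch below is only an illustration: `check_update_message` is a hypothetical helper, the key is assumed to be spelled `"parse_method"` as in the example, and the server may enforce these rules differently.

```python
# Parse methods listed in the reference text above.
PARSE_METHODS = {
    "naive", "manual", "qa", "table", "paper", "book",
    "laws", "presentation", "picture", "one", "knowledge_graph", "email",
}


def check_update_message(update_message: dict, chunk_count: int) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    method = update_message.get("parse_method")
    if method is not None and method not in PARSE_METHODS:
        problems.append(f"unknown parse_method: {method!r}")
    # The reference asks for chunk_count == 0 before changing the embedding model.
    if "embedding_model" in update_message and chunk_count != 0:
        problems.append("embedding_model can only be changed when chunk_count is 0")
    return problems


print(check_update_message({"embedding_model": "BAAI/bge-zh-v1.5"}, chunk_count=5))
```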
239
 
240
  :::tip API GROUPING
 
727
 
728
  ## Create chat
729
 
730
+ Creates a chat assistant.
731
+
732
  ```python
733
  RAGFlow.create_chat(
734
  name: str = "assistant",
 
737
  llm: Chat.LLM = None,
738
  prompt: Chat.Prompt = None
739
  ) -> Chat
 
740
  ```
741
 
742
  ### Returns
743
 
744
+ - Success: A `Chat` object representing the chat assistant.
745
+ - Failure: `Exception`
 
746
 
747
  #### name: `str`
748
 
749
+ The name of the chat assistant. Defaults to `"assistant"`.
750
 
751
  #### avatar: `str`
752
 
753
+ Base64 encoding of the avatar. Defaults to `""`.
 
 
754
 
755
+ #### knowledgebases: `list[str]`
756
 
757
+ The associated knowledge bases. Defaults to `["kb1"]`.
 
 
758
 
759
  #### llm: `LLM`
760
 
761
  The llm of the created chat. Defaults to `None`. When the value is `None`, a dictionary with the following values will be generated as the default.
762
 
763
  - **model_name**, `str`
764
+ The chat model name. If it is `None`, the user's default chat model is used.
765
  - **temperature**, `float`
766
  This parameter controls the randomness of predictions by the model. A lower temperature makes the model more confident in its responses, while a higher temperature makes it more creative and diverse. Defaults to `0.1`.
767
  - **top_p**, `float`
768
+ Also known as “nucleus sampling”, this parameter sets a threshold to select a smaller set of words to sample from. It focuses on the most likely words, cutting off the less probable ones. Defaults to `0.3`.
769
  - **presence_penalty**, `float`
770
  This discourages the model from repeating the same information by penalizing words that have already appeared in the conversation. Defaults to `0.2`.
771
  - **frequency_penalty**, `float`
 
775
 
776
  #### Prompt: `str`
777
 
778
+ Instructions for the LLM to follow when answering, such as character design, answer length, and answer language. Defaults to:
779
 
 
780
  ```
781
  You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence "The answer you are looking for is not found in the knowledge base!" Answers need to consider chat history.
782
  Here is the knowledge base:
 
789
  ```python
790
  from ragflow import RAGFlow
791
 
792
+ rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
793
+ knowledge_base = rag.list_datasets(name="kb_1")
794
+ assistant = rag.create_chat("Miss R", knowledgebases=knowledge_base)
795
  ```
796
 
797
  ---
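
The reference states that a default dictionary is generated when `llm` is `None`. A minimal sketch of that fallback is shown below; it covers only the three defaults visible in this diff (`temperature`, `top_p`, `presence_penalty`), and `resolve_llm` is a hypothetical helper rather than SDK code.

```python
from typing import Optional

# Only the defaults stated in the reference text; other settings are omitted.
DOCUMENTED_LLM_DEFAULTS = {
    "temperature": 0.1,
    "top_p": 0.3,
    "presence_penalty": 0.2,
}


def resolve_llm(llm: Optional[dict] = None) -> dict:
    """Return the documented defaults, overridden by any caller-supplied values."""
    resolved = dict(DOCUMENTED_LLM_DEFAULTS)  # copy so the defaults stay untouched
    if llm is not None:
        resolved.update(llm)
    return resolved


settings = resolve_llm({"temperature": 0.8})
print(settings)
```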
798
 
799
  ## Update chat
800
 
801
+ Updates the current chat assistant.
802
+
803
  ```python
804
  Chat.update(update_message: dict)
805
  ```
806
 
807
+ ### Parameters
808
+
809
+ #### update_message: `dict[str, Any]`, *Required*
810
+
811
+ - `"name"`: `str` The name of the chat assistant to update.
812
+ - `"avatar"`: `str` Base64 encoding of the avatar. Defaults to `""`.
813
+ - `"knowledgebases"`: `list[str]` Knowledge bases to update.
814
+ - `"llm"`: `dict` The LLM settings:
815
+   - `"model_name"`: `str` The chat model name.
816
+   - `"temperature"`: `float` This parameter controls the randomness of predictions by the model.
817
+   - `"top_p"`: `float` Also known as “nucleus sampling”, this parameter sets a threshold to select a smaller set of words to sample from.
818
+   - `"presence_penalty"`: `float` This discourages the model from repeating the same information by penalizing words that have already appeared in the conversation.
819
+   - `"frequency_penalty"`: `float` Similar to the presence penalty, this reduces the model’s tendency to repeat the same words frequently.
820
+   - `"max_token"`: `int` This sets the maximum length of the model’s output, measured in the number of tokens (words or pieces of words).
821
+ - `"prompt"`: Instructions for the LLM's responses, including character design, answer length, and language.
822
+
823
  ### Returns
824
 
825
+ - Success: No value is returned.
826
+ - Failure: `Exception`
 
827
 
828
  ### Examples
829
 
830
  ```python
831
  from ragflow import RAGFlow
832
 
833
+ rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
834
+ knowledge_base = rag.list_datasets(name="kb_1")
835
+ assistant = rag.create_chat("Miss R", knowledgebases=knowledge_base)
836
+ assistant.update({"llm": {"temperature":0.8}})
837
+
838
  ```
839
 
840
  ---
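
The example above passes a nested dictionary, `{"llm": {"temperature": 0.8}}`. The reference does not say whether `Chat.update()` merges nested settings or replaces them wholesale; the sketch below shows what a merge could look like, using a hypothetical `apply_update` helper on plain dictionaries.

```python
import copy


def apply_update(current: dict, update_message: dict) -> dict:
    """Return a new dict with update_message merged into current, recursing into nested dicts."""
    merged = copy.deepcopy(current)
    for key, value in update_message.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested settings instead of discarding the untouched ones.
            merged[key] = apply_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


current = {"name": "Miss R", "llm": {"temperature": 0.1, "top_p": 0.3}}
updated = apply_update(current, {"llm": {"temperature": 0.8}})
print(updated)
```

Under this merge assumption, `top_p` keeps its previous value while only `temperature` changes; a replace-style update would drop `top_p` instead.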
841
 
842
  ## Delete chats
843
 
844
+ Deletes specified chat assistants.
845
+
846
  ```python
847
  RAGFlow.delete_chats(ids: List[str] = None)
848
  ```
 
849
 
850
+ ### Parameters
851
 
852
+ #### ids
853
 
854
+ IDs of the chat assistants to delete.
855
 
856
  ### Returns
857
 
858
+ - Success: No value is returned.
859
+ - Failure: `Exception`
 
860
 
861
  ### Examples
862
 
863
  ```python
864
  from ragflow import RAGFlow
865
 
866
+ rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
867
  rag.delete_chats(ids=["id_1","id_2"])
868
  ```
869
 
 
884
 
885
  ### Parameters
886
 
887
+ #### page
888
 
889
+ The current page number to retrieve from the paginated results. Defaults to `1`.
 
890
 
891
+ #### page_size
892
 
893
+ The number of records on each page. Defaults to `1024`.
 
894
 
895
+ #### order_by
896
 
897
+ The attribute by which the results are sorted. Defaults to `"create_time"`.
 
898
 
899
+ #### desc
900
 
901
+ Indicates whether to sort the results in descending order. Defaults to `True`.
 
902
 
903
  #### id: `string`
904
 
905
+ The ID of the chat to be retrieved. Defaults to `None`.
 
906
 
907
  #### name: `string`
908
 
909
+ The name of the chat to be retrieved. Defaults to `None`.
910
+
911
  ### Returns
912
 
913
+ - Success: A list of `Chat` objects representing the retrieved chat assistants.
914
+ - Failure: `Exception`.
915
 
916
  ### Examples
917
 
918
  ```python
919
  from ragflow import RAGFlow
920
 
921
+ rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
922
+ for assistant in rag.list_chats():
923
+     print(assistant)
924
  ```
925
 
926
  ---
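
The `order_by` and `desc` parameters described above amount to sorting the records by one attribute. A small sketch of that behavior on plain dictionaries follows; `sort_records` and the sample records are hypothetical, and the integer `create_time` values are made up for illustration.

```python
def sort_records(records: list, order_by: str = "create_time", desc: bool = True) -> list:
    """Return records sorted by the given attribute, newest first by default."""
    return sorted(records, key=lambda record: record[order_by], reverse=desc)


chats = [
    {"name": "assistant_a", "create_time": 20},
    {"name": "assistant_b", "create_time": 30},
    {"name": "assistant_c", "create_time": 10},
]
newest_first = [chat["name"] for chat in sort_records(chats)]
oldest_first = [chat["name"] for chat in sort_records(chats, desc=False)]
print(newest_first, oldest_first)
```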