Commit 305b8c0 · committed by writinwaters
1 Parent(s): bf4c34e
Miscellaneous edits to RAGFlow's UI (#3337)
### What problem does this PR solve?
### Type of change
- [x] Documentation Update
agent/templates/investment_advisor.json
CHANGED
@@ -1,7 +1,7 @@
 {
     "id": 8,
     "title": "Intelligent investment advisor",
-    "description": "An intelligent investment advisor that
+    "description": "An intelligent investment advisor that answers your financial questions using real-time domestic financial data.",
     "canvas_type": "chatbot",
     "dsl": {
         "answer": [],
agent/templates/medical_consultation.json
CHANGED
@@ -1,7 +1,7 @@
 {
     "id": 7,
     "title": "Medical consultation",
-    "description": "
+    "description": "A consultant that offers medical suggestions using an internal QA dataset and PubMed search results. Note that this agent's answers are for reference only and may not be valid. The dataset can be found at https://huggingface.co/datasets/InfiniFlow/medical_QA/tree/main",
     "canvas_type": "chatbot",
     "dsl": {
         "answer": [],
api/db/services/document_service.py
CHANGED
@@ -410,7 +410,7 @@ def queue_raptor_tasks(doc):
         "doc_id": doc["id"],
         "from_page": 0,
         "to_page": -1,
-        "progress_msg": "Start to do RAPTOR (Recursive Abstractive Processing
+        "progress_msg": "Start to do RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval)."
     }

     task = new_task()
docs/configurations.md
CHANGED
@@ -136,37 +136,44 @@ If you cannot download the RAGFlow Docker image, try the following mirrors.

 [service_conf.yaml](https://github.com/infiniflow/ragflow/blob/main/docker/service_conf.yaml) specifies the system-level configuration for RAGFlow and is used by its API server and task executor.

-
-
-
-
-
-
-
-
-
-
-
-
-- `
-
-
-
-
-- `
-
-
-
-
-
-
-
-
-
-
-
-
-
+### `ragflow`
+
+- `host`: The API server's IP address inside the Docker container. Defaults to `0.0.0.0`.
+- `port`: The API server's serving port inside the Docker container. Defaults to `9380`.
+
+### `mysql`
+
+- `name`: The MySQL database name. Defaults to `rag_flow`.
+- `user`: The username for MySQL.
+- `password`: The password for MySQL. When updated, you must revise the `MYSQL_PASSWORD` variable in [.env](https://github.com/infiniflow/ragflow/blob/main/docker/.env) accordingly.
+- `port`: The MySQL serving port inside the Docker container. Defaults to `3306`.
+- `max_connections`: The maximum number of concurrent connections to the MySQL database. Defaults to `100`.
+- `stale_timeout`: Timeout in seconds.
+
+### `minio`
+
+- `user`: The username for MinIO. When updated, you must revise the `MINIO_USER` variable in [.env](https://github.com/infiniflow/ragflow/blob/main/docker/.env) accordingly.
+- `password`: The password for MinIO. When updated, you must revise the `MINIO_PASSWORD` variable in [.env](https://github.com/infiniflow/ragflow/blob/main/docker/.env) accordingly.
+- `host`: The MinIO serving IP *and* port inside the Docker container. Defaults to `minio:9000`.
+
+### `oauth`
+
+The OAuth configuration for signing up or signing in to RAGFlow using a third-party account. It is disabled by default. To enable this feature, uncomment the corresponding lines in **service_conf.yaml**.
+
+- `github`: The GitHub authentication settings for your application. Visit the [Github Developer Settings](https://github.com/settings/developers) page to obtain your client_id and secret_key.
+
+### `user_default_llm`
+
+The default LLM to use for a new RAGFlow user. It is disabled by default. To enable this feature, uncomment the corresponding lines in **service_conf.yaml**.
+
+- `factory`: The LLM supplier. Available options:
+  - `"OpenAI"`
+  - `"DeepSeek"`
+  - `"Moonshot"`
+  - `"Tongyi-Qianwen"`
+  - `"VolcEngine"`
+  - `"ZHIPU-AI"`
+- `api_key`: The API key for the specified LLM. You will need to apply for your model API key online.

 :::tip NOTE
 If you do not set the default LLM here, configure the default LLM on the **Settings** page in the RAGFlow UI.
|
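The `mysql` and `minio` notes in docs/configurations.md say that three values in **service_conf.yaml** must be kept in sync with variables in **.env**. A minimal sketch of such a consistency check follows. The helper `find_env_mismatches`, the `CONF_TO_ENV` table, and all example values are hypothetical and not part of RAGFlow; both files are assumed to have already been parsed into plain dictionaries.

```python
# Hypothetical helper: the key-to-variable mapping is taken from the notes in
# docs/configurations.md; the function itself is illustrative only.
CONF_TO_ENV = {
    ("mysql", "password"): "MYSQL_PASSWORD",
    ("minio", "user"): "MINIO_USER",
    ("minio", "password"): "MINIO_PASSWORD",
}


def find_env_mismatches(conf: dict, env: dict) -> list[str]:
    """Return the .env variable names whose values differ from service_conf.yaml."""
    mismatches = []
    for (section, key), env_name in CONF_TO_ENV.items():
        conf_value = conf.get(section, {}).get(key)
        # Settings that are absent from the config are skipped, not reported.
        if conf_value is not None and str(conf_value) != env.get(env_name):
            mismatches.append(env_name)
    return mismatches


# Placeholder values for illustration; these are not RAGFlow defaults.
conf = {
    "mysql": {"name": "rag_flow", "user": "db-user", "password": "new-secret", "port": 3306},
    "minio": {"user": "storage-admin", "password": "storage-secret", "host": "minio:9000"},
}
env = {
    "MYSQL_PASSWORD": "old-secret",
    "MINIO_USER": "storage-admin",
    "MINIO_PASSWORD": "storage-secret",
}
print(find_env_mismatches(conf, env))
```

With these placeholder values, only the MySQL password differs between the two dictionaries, so that is the only variable reported.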
docs/guides/configure_knowledge_base.md
CHANGED
@@ -52,13 +52,13 @@ RAGFlow offers multiple chunking template to facilitate chunking files of differ
 | Picture | | JPEG, JPG, PNG, TIF, GIF |
 | One | The entire document is chunked as one. | DOCX, EXCEL, PDF, TXT |

-You can also change the chunk template for a particular file on the **Datasets** page.
+You can also change the chunk template for a particular file on the **Datasets** page.

 

 ### Select embedding model

-An embedding model
+An embedding model converts chunks into embeddings. It cannot be changed once the knowledge base has chunks. To switch to a different embedding model, You must delete all chunks in the knowledge base. The obvious reason is that we *must* ensure that files in a specific knowledge base are converted to embeddings using the *same* embedding model (ensure that they are compared in the same embedding space).

 The following embedding models can be deployed locally:

web/src/locales/en.ts
CHANGED
@@ -157,14 +157,14 @@ export default {
     delimiter: `Delimiter`,
     html4excel: 'Excel to HTML',
     html4excelTip: `Excel will be parsed into HTML table or not. If it's FALSE, every row in Excel will be formed as a chunk.`,
-    autoKeywords: 'Auto
-    autoKeywordsTip: `Extract N keywords for
-    autoQuestions: 'Auto
-    autoQuestionsTip: `Extract N questions for
+    autoKeywords: 'Auto-keyword',
+    autoKeywordsTip: `Extract N keywords for each chunk to improve their ranking for queries containing those keywords. You can check or update the added keywords for a chunk from the chunk list. Be aware that extra tokens will be consumed by the LLM specified in 'System model settings'.`,
+    autoQuestions: 'Auto-question',
+    autoQuestionsTip: `Extract N questions for each chunk to improve their ranking for queries containing those questions. You can check or update the added questions for a chunk from the chunk list. This feature will not disrupt the chunking process if an error occurs, except that it may add an empty result to the original chunk. Be aware that extra tokens will be consumed by the LLM specified in 'System model settings'.`,
   },
   knowledgeConfiguration: {
     titleDescription:
-      'Update your knowledge base
+      'Update your knowledge base configurations here, particularly the chunk method.',
     name: 'Knowledge base name',
     photo: 'Knowledge base photo',
     description: 'Description',
@@ -176,13 +176,13 @@ export default {
     chunkTokenNumber: 'Chunk token number',
     chunkTokenNumberMessage: 'Chunk token number is required',
     embeddingModelTip:
-      "The
+      "The model that converts chunks into embeddings. It cannot be changed once the knowledge base has chunks. To switch to a different embedding model, You must delete all chunks in the knowledge base.",
     permissionsTip:
-      "If
+      "If set to 'Team', all team members will be able to manage the knowledge base.",
     chunkTokenNumberTip:
-      'It
+      'It sets the token threshold for a chunk. A paragraph with fewer tokens than this threshold will be combined with the following paragraph until the token count exceeds the threshold, at which point a chunk is created.',
     chunkMethod: 'Chunk method',
-    chunkMethodTip: '
+    chunkMethodTip: 'Tips are on the right.',
     upload: 'Upload',
     english: 'English',
     chinese: 'Chinese',
@@ -192,11 +192,11 @@ export default {
     me: 'Only me',
     team: 'Team',
     cancel: 'Cancel',
-    methodTitle: '
+    methodTitle: 'Chunk method description',
     methodExamples: 'Examples',
     methodExamplesDescription:
-      'The following screenshots are
-    dialogueExamplesTitle: 'Dialogue
+      'The following screenshots are provided for clarity.',
+    dialogueExamplesTitle: 'Dialogue examples',
     methodEmpty:
       'This will display a visual explanation of the knowledge base categories',
     book: `<p>Supported file formats are <b>DOCX</b>, <b>PDF</b>, <b>TXT</b>.</p><p>
@@ -208,8 +208,7 @@ export default {
       The chunk granularity is consistent with 'ARTICLE', and all the upper level text will be included in the chunk.
       </p>`,
     manual: `<p>Only <b>PDF</b> is supported.</p><p>
-      We assume manual has hierarchical section structure
-      So, the figures and tables in the same section will not be sliced apart, and chunk size might be large.
+      We assume that the manual has a hierarchical section structure, using the lowest section titles as basic unit for chunking documents. Therefore, figures and tables in the same section will not be separated, which may result in larger chunk sizes.
       </p>`,
     naive: `<p>Supported file formats are <b>DOCX, EXCEL, PPT, IMAGE, PDF, TXT, MD, JSON, EML, HTML</b>.</p>
       <p>This method apply the naive ways to chunk files: </p>
@@ -292,7 +291,7 @@ Successive text will be sliced into pieces each of which is around 512 token num
       Mind the entiry type you need to specify.</p>`,
     useRaptor: 'Use RAPTOR to enhance retrieval',
     useRaptorTip:
-      'Recursive Abstractive Processing for Tree-Organized Retrieval,
+      'Recursive Abstractive Processing for Tree-Organized Retrieval, see https://huggingface.co/papers/2401.18059 for more information',
     prompt: 'Prompt',
     promptTip: 'LLM prompt used for summarization.',
     promptMessage: 'Prompt is required',
|
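The new `chunkTokenNumberTip` text describes a threshold-based merge: paragraphs are combined with the ones that follow until the running token count exceeds the threshold, at which point a chunk is created. The sketch below illustrates that description only and is not RAGFlow's chunking code. `count_tokens` and `merge_paragraphs` are hypothetical names, and counting whitespace-separated words is an assumption standing in for a real tokenizer.

```python
def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokenizer output.
    return len(text.split())


def merge_paragraphs(paragraphs: list[str], threshold: int) -> list[str]:
    """Combine consecutive paragraphs until the running token count exceeds threshold."""
    chunks: list[str] = []
    buffer: list[str] = []
    buffered_tokens = 0
    for paragraph in paragraphs:
        buffer.append(paragraph)
        buffered_tokens += count_tokens(paragraph)
        if buffered_tokens > threshold:
            # The threshold has been exceeded, so the buffer becomes one chunk.
            chunks.append("\n".join(buffer))
            buffer, buffered_tokens = [], 0
    if buffer:
        # Trailing paragraphs that never crossed the threshold form a last chunk.
        chunks.append("\n".join(buffer))
    return chunks


paragraphs = ["one two", "three four five", "six", "seven eight nine ten"]
for chunk in merge_paragraphs(paragraphs, threshold=4):
    print(repr(chunk))
```

In this sketch a chunk can end up larger than the threshold, because a paragraph is never split: the threshold only decides when merging stops.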