writinwaters committed
Commit a3e5fd8 · 1 Parent(s): 6a60ab2

Added release notes for v0.15.0 (#4056)


### What problem does this PR solve?

Adds release notes for v0.15.0 and polishes related documentation wording.

### Type of change


- [x] Documentation Update

README_zh.md CHANGED
@@ -156,7 +156,7 @@
 | nightly | ≈9 | :heavy_check_mark: | *Unstable* nightly build |
 | nightly-slim | ≈2 | ❌ | *Unstable* nightly build |
 
- > [!TIP]
+ > [!TIP]
 > If you have trouble pulling the Docker image, you can switch to the corresponding Huawei Cloud or Alibaba Cloud mirror by following the comments on the `RAGFLOW_IMAGE` variable in **docker/.env**.
 > - Huawei Cloud image name: `swr.cn-north-4.myhuaweicloud.com/infiniflow/ragflow`
 > - Alibaba Cloud image name: `registry.cn-hangzhou.aliyuncs.com/infiniflow/ragflow`
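For reference, the switch is a one-line edit in **docker/.env**: for example, `RAGFLOW_IMAGE=swr.cn-north-4.myhuaweicloud.com/infiniflow/ragflow:v0.15.0` (the `v0.15.0` tag is illustrative; use whichever tag you intend to deploy).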
docs/guides/deploy_local_llm.mdx CHANGED
@@ -9,6 +9,8 @@ import TabItem from '@theme/TabItem';
 
 Run models locally using Ollama, Xinference, or other frameworks.
 
+ ---
+
 RAGFlow supports deploying models locally using Ollama, Xinference, IPEX-LLM, or jina. If you have locally deployed models to leverage or wish to enable GPU or CUDA for inference acceleration, you can bind Ollama or Xinference into RAGFlow and use either of them as a local "server" for interacting with your local models.
 
 RAGFlow seamlessly integrates with Ollama and Xinference, without the need for further environment configurations. You can use them to deploy two types of local models in RAGFlow: chat models and embedding models.
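As context for the Ollama path described in this guide, here is a minimal sketch for confirming that a locally served Ollama instance is reachable before binding it into RAGFlow. The default endpoint `http://localhost:11434` and the `/api/tags` listing route are standard Ollama behavior, not something defined by this doc; adjust the address if you have remapped it.

```python
# Minimal sketch: verify a local Ollama server is up and list its pulled
# models, so you know which model names to register in RAGFlow.
import json
import urllib.request

OLLAMA_BASE = "http://localhost:11434"  # Ollama's default address; adjust if remapped

with urllib.request.urlopen(f"{OLLAMA_BASE}/api/tags") as resp:
    models = json.load(resp)["models"]

for m in models:
    # e.g. "llama3:latest"; use this exact name when adding the model in RAGFlow
    print(m["name"])
```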
docs/guides/run_health_check.md CHANGED
@@ -7,6 +7,8 @@ slug: /run_health_check
 
 Double-check the health status of RAGFlow's dependencies.
 
+ ---
+
 The operation of RAGFlow depends on four services:
 
 - **Elasticsearch** (default) or [Infinity](https://github.com/infiniflow/infinity) as the document engine
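To complement the built-in health check page, a manual probe can be sketched as below. The service list beyond Elasticsearch/Infinity and the ports shown are assumptions based on a typical docker-compose deployment; adjust them to match your `.env`.

```python
# Hedged sketch of a manual reachability probe for RAGFlow's backing services.
# Hosts/ports are assumptions (canonical defaults); your compose file may remap them.
import socket

SERVICES = {
    "elasticsearch": ("localhost", 9200),  # or Infinity, if configured as the document engine
    "mysql": ("localhost", 3306),
    "minio": ("localhost", 9000),
    "redis": ("localhost", 6379),
}

for name, (host, port) in SERVICES.items():
    try:
        # A successful TCP connect only proves the port is open, not full health.
        with socket.create_connection((host, port), timeout=3):
            print(f"{name}: reachable")
    except OSError as exc:
        print(f"{name}: DOWN ({exc})")
```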
docs/references/http_api_reference.md CHANGED
@@ -1372,7 +1372,7 @@ curl --request POST \
 - `"model_name"`, `string`
   The chat model name. If not set, the user's default chat model will be used.
 - `"temperature"`: `float`
-   Controls the randomness of the model's predictions. A lower temperature increases the model's confidence in its responses; a higher temperature increases creativity and diversity. Defaults to `0.1`.
+   Controls the randomness of the model's predictions. A lower temperature results in more conservative responses, while a higher temperature yields more creative and diverse responses. Defaults to `0.1`.
 - `"top_p"`: `float`
   Also known as “nucleus sampling”, this parameter sets a threshold to select a smaller set of words to sample from. It focuses on the most likely words, cutting off the less probable ones. Defaults to `0.3`
 - `"presence_penalty"`: `float`
@@ -1380,7 +1380,7 @@ curl --request POST \
 - `"frequency penalty"`: `float`
   Similar to the presence penalty, this reduces the model’s tendency to repeat the same words frequently. Defaults to `0.7`.
 - `"max_token"`: `integer`
-   The maximum length of the model's output, measured in the number of tokens (words or pieces of words). If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses. Defaults to `512`.
+   The maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to `512`. If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.
 - `"prompt"`: (*Body parameter*), `object`
   Instructions for the LLM to follow. If it is not explicitly set, a JSON object with the following values will be generated as the default. A `prompt` JSON object contains the following attributes:
   - `"similarity_threshold"`: `float` RAGFlow employs either a combination of weighted keyword similarity and weighted vector cosine similarity, or a combination of weighted keyword similarity and weighted reranking score during retrieval. This argument sets the threshold for similarities between the user query and chunks. If a similarity score falls below this threshold, the corresponding chunk will be excluded from the results. The default value is `0.2`.
@@ -1507,7 +1507,7 @@ curl --request PUT \
 - `"model_name"`, `string`
   The chat model name. If not set, the user's default chat model will be used.
 - `"temperature"`: `float`
-   Controls the randomness of the model's predictions. A lower temperature increases the model's confidence in its responses; a higher temperature increases creativity and diversity. Defaults to `0.1`.
+   Controls the randomness of the model's predictions. A lower temperature results in more conservative responses, while a higher temperature yields more creative and diverse responses. Defaults to `0.1`.
 - `"top_p"`: `float`
   Also known as “nucleus sampling”, this parameter sets a threshold to select a smaller set of words to sample from. It focuses on the most likely words, cutting off the less probable ones. Defaults to `0.3`
 - `"presence_penalty"`: `float`
@@ -1515,7 +1515,7 @@ curl --request PUT \
 - `"frequency penalty"`: `float`
   Similar to the presence penalty, this reduces the model’s tendency to repeat the same words frequently. Defaults to `0.7`.
 - `"max_token"`: `integer`
-   The maximum length of the model's output, measured in the number of tokens (words or pieces of words). If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses. Defaults to `512`.
+   The maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to `512`. If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.
 - `"prompt"`: (*Body parameter*), `object`
   Instructions for the LLM to follow. A `prompt` object contains the following attributes:
   - `"similarity_threshold"`: `float` RAGFlow employs either a combination of weighted keyword similarity and weighted vector cosine similarity, or a combination of weighted keyword similarity and weighted rerank score during retrieval. This argument sets the threshold for similarities between the user query and chunks. If a similarity score falls below this threshold, the corresponding chunk will be excluded from the results. The default value is `0.2`.
@@ -2149,6 +2149,7 @@ Failure:
 ---
 
 ## Create session with agent
+
 *If there are parameters in the `begin` component, the session cannot be created in this way.*
 
 **POST** `/api/v1/agents/{agent_id}/sessions`
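For illustration, here is a minimal `requests` sketch of the `Create session with agent` endpoint documented in this file. The Bearer-token header and server address are assumptions; consult the full HTTP API reference for the authoritative request shape.

```python
# Hedged sketch: create a session with an agent over the HTTP API.
import requests

BASE_URL = "http://localhost:9380"  # assumed RAGFlow server address
API_KEY = "<YOUR_API_KEY>"
AGENT_ID = "<AGENT_ID>"

resp = requests.post(
    f"{BASE_URL}/api/v1/agents/{AGENT_ID}/sessions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={},  # note: creation fails this way if the agent's `begin` component takes parameters
)
resp.raise_for_status()
print(resp.json())
```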
docs/references/python_api_reference.md CHANGED
@@ -950,7 +950,7 @@ The LLM settings for the chat assistant to create. Defaults to `None`. When the
 - `model_name`: `str`
   The chat model name. If it is `None`, the user's default chat model will be used.
 - `temperature`: `float`
-   Controls the randomness of the model's predictions. A lower temperature increases the model's confidence in its responses; a higher temperature increases creativity and diversity. Defaults to `0.1`.
+   Controls the randomness of the model's predictions. A lower temperature results in more conservative responses, while a higher temperature yields more creative and diverse responses. Defaults to `0.1`.
 - `top_p`: `float`
   Also known as “nucleus sampling”, this parameter sets a threshold to select a smaller set of words to sample from. It focuses on the most likely words, cutting off the less probable ones. Defaults to `0.3`
 - `presence_penalty`: `float`
@@ -958,7 +958,7 @@ The LLM settings for the chat assistant to create. Defaults to `None`. When the
 - `frequency penalty`: `float`
   Similar to the presence penalty, this reduces the model’s tendency to repeat the same words frequently. Defaults to `0.7`.
 - `max_token`: `int`
-   The maximum length of the model's output, measured in the number of tokens (words or pieces of words). If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses. Defaults to `512`.
+   The maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to `512`. If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.
 
 #### prompt: `Chat.Prompt`
 
@@ -1016,11 +1016,11 @@ A dictionary representing the attributes to update, with the following keys:
 - `"dataset_ids"`: `list[str]` The datasets to update.
 - `"llm"`: `dict` The LLM settings:
   - `"model_name"`, `str` The chat model name.
-   - `"temperature"`, `float` Controls the randomness of the model's predictions.
+   - `"temperature"`, `float` Controls the randomness of the model's predictions. A lower temperature results in more conservative responses, while a higher temperature yields more creative and diverse responses.
   - `"top_p"`, `float` Also known as “nucleus sampling”, this parameter sets a threshold to select a smaller set of words to sample from.
   - `"presence_penalty"`, `float` This discourages the model from repeating the same information by penalizing words that have appeared in the conversation.
   - `"frequency penalty"`, `float` Similar to presence penalty, this reduces the model’s tendency to repeat the same words.
-   - `"max_token"`, `int` The maximum length of the model's output, measured in the number of tokens (words or pieces of words). If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses. Defaults to `512`.
+   - `"max_token"`, `int` The maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to `512`. If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.
 - `"prompt"` : Instructions for the LLM to follow.
   - `"similarity_threshold"`: `float` RAGFlow employs either a combination of weighted keyword similarity and weighted vector cosine similarity, or a combination of weighted keyword similarity and weighted rerank score during retrieval. This argument sets the threshold for similarities between the user query and chunks. If a similarity score falls below this threshold, the corresponding chunk will be excluded from the results. The default value is `0.2`.
   - `"keywords_similarity_weight"`: `float` This argument sets the weight of keyword similarity in the hybrid similarity score with vector cosine similarity or reranking model similarity. By adjusting this weight, you can control the influence of keyword similarity in relation to other similarity measures. The default value is `0.7`.
docs/release_notes.md CHANGED
@@ -7,6 +7,40 @@ slug: /release_notes
 
 Key features, improvements and bug fixes in the latest releases.
 
+ ## v0.15.0
+
+ Released on December 18, 2024.
+
+ ### New features
+
+ - Introduces additional Agent-specific APIs.
+ - Supports using page rank score to improve retrieval performance when searching across multiple knowledge bases.
+ - Offers an iframe in Chat and Agent to facilitate the integration of RAGFlow into your webpage.
+ - Adds a Helm chart for deploying RAGFlow on Kubernetes.
+ - Supports importing or exporting an agent in JSON format.
+ - Supports step run for Agent components/tools.
+ - Adds a new UI language: Japanese.
+ - Supports resuming GraphRAG and RAPTOR from a failure, enhancing task management resilience.
+ - Adds more Mistral models.
+ - Adds a dark mode to the UI, allowing users to toggle between light and dark themes.
+
+ ### Improvements
+
+ - Upgrades the document layout recognition model in Deepdoc.
+ - Significantly enhances the retrieval performance when using [Infinity](https://github.com/infiniflow/infinity) as the document engine.
+
+ ### Related APIs
+
+ #### HTTP APIs
+
+ - [List agent sessions](https://ragflow.io/docs/dev/http_api_reference#list-agent-sessions)
+ - [List agents](https://ragflow.io/docs/dev/http_api_reference#list-agents)
+
+ #### Python APIs
+
+ - [List agent sessions](https://ragflow.io/docs/dev/python_api_reference#list-agent-sessions)
+ - [List agents](https://ragflow.io/docs/dev/python_api_reference#list-agents)
+
 ## v0.14.1
 
 Released on November 29, 2024.
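As a quick illustration of the newly listed APIs, a call to `List agents` might look like the sketch below. The `GET /api/v1/agents` path, Bearer auth, and the `data` response wrapper are assumptions drawn from the linked HTTP API reference, not from this changelog.

```python
# Hedged sketch: list agents via the new HTTP API noted in the release notes.
import requests

resp = requests.get(
    "http://localhost:9380/api/v1/agents",  # assumed path per the linked reference
    headers={"Authorization": "Bearer <YOUR_API_KEY>"},
)
resp.raise_for_status()
for agent in resp.json().get("data", []):
    print(agent.get("id"), agent.get("title"))
```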
web/src/locales/en.ts CHANGED
@@ -136,7 +136,7 @@ export default {
   toMessage: 'Missing end page number (excluded)',
   layoutRecognize: 'Layout recognition',
   layoutRecognizeTip:
-     'Use visual models for layout analysis to better understand the structure of the document and effectively locate document titles, text blocks, images, and tables. If disabled, only the plain text from the PDF will be retrieved.',
+     'Use visual models for layout analysis to better understand the structure of the document and effectively locate document titles, text blocks, images, and tables. If disabled, only the plain text in the PDF will be retrieved.',
   taskPageSize: 'Task page size',
   taskPageSizeMessage: 'Please input your task page size!',
   taskPageSizeTip: `During layout recognition, a PDF file is split into chunks and processed in parallel to increase processing speed. This parameter sets the size of each chunk. A larger chunk size reduces the likelihood of splitting continuous text between pages.`,
@@ -398,7 +398,7 @@ The above is the content you need to summarize.`,
     'Similar to the presence penalty, this reduces the model’s tendency to repeat the same words frequently.',
   maxTokens: 'Max tokens',
   maxTokensMessage: 'Max tokens is required',
-   maxTokensTip: `This sets the maximum length of the model's output, measured in the number of tokens (words or pieces of words). If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses. Defaults to 512.`,
+   maxTokensTip: `This sets the maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to 512. If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.`,
   maxTokensInvalidMessage: 'Please enter a valid number for Max Tokens.',
   maxTokensMinMessage: 'Max Tokens cannot be less than 0.',
   quote: 'Show quote',
@@ -430,7 +430,7 @@ The above is the content you need to summarize.`,
   partialTitle: 'Partial Embed',
   extensionTitle: 'Chrome Extension',
   tokenError: 'Please create API Token first!',
-   betaError: 'Please apply an API key in system setting firstly.',
+   betaError: 'Please acquire a RAGFlow API key from the System Settings page first.',
   searching: 'Searching...',
   parsing: 'Parsing',
   uploading: 'Uploading',
@@ -453,7 +453,7 @@ The above is the content you need to summarize.`,
   profileDescription: 'Update your photo and personal details here.',
   maxTokens: 'Max Tokens',
   maxTokensMessage: 'Max Tokens is required',
-   maxTokensTip: `This sets the maximum length of the model's output, measured in the number of tokens (words or pieces of words). If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses. Defaults to 512.`,
+   maxTokensTip: `This sets the maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to 512. If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.`,
   maxTokensInvalidMessage: 'Please enter a valid number for Max Tokens.',
   maxTokensMinMessage: 'Max Tokens cannot be less than 0.',
   password: 'Password',