Commit e7e30bf · Zhen Wang committed
2 parents: 927d6b8 0c54322

Merge branch 'infiniflow:main' into main
README.md CHANGED
@@ -7,6 +7,7 @@
  <p align="center">
  <a href="./README.md">English</a> |
  <a href="./README_zh.md">简体中文</a> |
+ <a href="./README_tzh.md">繁体中文</a> |
  <a href="./README_ja.md">日本語</a> |
  <a href="./README_ko.md">한국어</a> |
  <a href="./README_id.md">Bahasa Indonesia</a> |
@@ -77,12 +78,12 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
 
  ## 🔥 Latest Updates
 
+ - 2025-01-26 Optimizes knowledge graph extraction and application, offering various configuration options.
  - 2024-12-18 Upgrades Document Layout Analysis model in Deepdoc.
  - 2024-12-04 Adds support for pagerank score in knowledge base.
  - 2024-11-22 Adds more variables to Agent.
  - 2024-11-01 Adds keyword extraction and related question generation to the parsed chunks to improve the accuracy of retrieval.
  - 2024-08-22 Support text to SQL statements through RAG.
- - 2024-08-02 Supports GraphRAG inspired by [graphrag](https://github.com/microsoft/graphrag) and mind map.
 
  ## 🎉 Stay Tuned
 
README_id.md CHANGED
@@ -7,6 +7,7 @@
  <p align="center">
  <a href="./README.md">English</a> |
  <a href="./README_zh.md">简体中文</a> |
+ <a href="./README_tzh.md">繁体中文</a> |
  <a href="./README_ja.md">日本語</a> |
  <a href="./README_ko.md">한국어</a> |
  <a href="./README_id.md">Bahasa Indonesia</a> |
@@ -74,12 +75,12 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
 
  ## 🔥 Pembaruan Terbaru
 
+ - 2025-01-26 Optimalkan ekstraksi dan penerapan grafik pengetahuan dan sediakan berbagai opsi konfigurasi.
  - 2024-12-18 Meningkatkan model Analisis Tata Letak Dokumen di Deepdoc.
  - 2024-12-04 Mendukung skor pagerank ke basis pengetahuan.
  - 2024-11-22 Peningkatan definisi dan penggunaan variabel di Agen.
  - 2024-11-01 Penambahan ekstraksi kata kunci dan pembuatan pertanyaan terkait untuk meningkatkan akurasi pengambilan.
  - 2024-08-22 Dukungan untuk teks ke pernyataan SQL melalui RAG.
- - 2024-08-02 Dukungan GraphRAG yang terinspirasi oleh [graphrag](https://github.com/microsoft/graphrag) dan mind map.
 
  ## 🎉 Tetap Terkini
 
README_ja.md CHANGED
@@ -7,6 +7,7 @@
  <p align="center">
  <a href="./README.md">English</a> |
  <a href="./README_zh.md">简体中文</a> |
+ <a href="./README_tzh.md">繁体中文</a> |
  <a href="./README_ja.md">日本語</a> |
  <a href="./README_ko.md">한국어</a> |
  <a href="./README_id.md">Bahasa Indonesia</a> |
@@ -54,12 +55,12 @@
 
  ## 🔥 最新情報
 
+ - 2025-01-26 ナレッジ グラフの抽出と適用を最適化し、さまざまな構成オプションを提供します。
  - 2024-12-18 Deepdoc のドキュメント レイアウト分析モデルをアップグレードします。
  - 2024-12-04 ナレッジ ベースへのページランク スコアをサポートしました。
  - 2024-11-22 エージェントでの変数の定義と使用法を改善しました。
  - 2024-11-01 再現の精度を向上させるために、解析されたチャンクにキーワード抽出と関連質問の生成を追加しました。
  - 2024-08-22 RAG を介して SQL ステートメントへのテキストをサポートします。
- - 2024-08-02 [graphrag](https://github.com/microsoft/graphrag) からインスピレーションを得た GraphRAG とマインド マップをサポートします。
 
  ## 🎉 続きを楽しみに
 
README_ko.md CHANGED
@@ -7,6 +7,7 @@
  <p align="center">
  <a href="./README.md">English</a> |
  <a href="./README_zh.md">简体中文</a> |
+ <a href="./README_tzh.md">繁体中文</a> |
  <a href="./README_ja.md">日本語</a> |
  <a href="./README_ko.md">한국어</a> |
  <a href="./README_id.md">Bahasa Indonesia</a> |
@@ -54,13 +55,13 @@
 
  ## 🔥 업데이트
 
+ - 2025-01-26 지식 그래프 추출 및 적용을 최적화하고 다양한 구성 옵션을 제공합니다.
  - 2024-12-18 Deepdoc의 문서 레이아웃 분석 모델 업그레이드.
  - 2024-12-04 지식베이스에 대한 페이지랭크 점수를 지원합니다.
 
  - 2024-11-22 에이전트의 변수 정의 및 사용을 개선했습니다.
  - 2024-11-01 파싱된 청크에 키워드 추출 및 관련 질문 생성을 추가하여 재현율을 향상시킵니다.
  - 2024-08-22 RAG를 통해 SQL 문에 텍스트를 지원합니다.
- - 2024-08-02: [graphrag](https://github.com/microsoft/graphrag)와 마인드맵에서 영감을 받은 GraphRAG를 지원합니다.
 
  ## 🎉 계속 지켜봐 주세요
 
README_pt_br.md CHANGED
@@ -7,6 +7,7 @@
  <p align="center">
  <a href="./README.md">English</a> |
  <a href="./README_zh.md">简体中文</a> |
+ <a href="./README_tzh.md">繁体中文</a> |
  <a href="./README_ja.md">日本語</a> |
  <a href="./README_ko.md">한국어</a> |
  <a href="./README_id.md">Bahasa Indonesia</a> |
@@ -74,12 +75,12 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
 
  ## 🔥 Últimas Atualizações
 
+ - 26-01-2025 Otimize a extração e aplicação de gráficos de conhecimento e forneça uma variedade de opções de configuração.
  - 18-12-2024 Atualiza o modelo de Análise de Layout de Documentos no Deepdoc.
  - 04-12-2024 Adiciona suporte para pontuação de pagerank na base de conhecimento.
  - 22-11-2024 Adiciona mais variáveis para o Agente.
  - 01-11-2024 Adiciona extração de palavras-chave e geração de perguntas relacionadas aos blocos analisados para melhorar a precisão da recuperação.
  - 22-08-2024 Suporta conversão de texto para comandos SQL via RAG.
- - 02-08-2024 Suporta GraphRAG inspirado pelo [graphrag](https://github.com/microsoft/graphrag) e mapa mental.
 
  ## 🎉 Fique Ligado
 
README_tzh.md CHANGED
@@ -54,12 +54,12 @@
 
  ## 🔥 近期更新
 
+ - 2025-01-26 最佳化知識圖譜的擷取與應用,提供了多種配置選擇。
  - 2024-12-18 升級了 Deepdoc 的文檔佈局分析模型。
  - 2024-12-04 支援知識庫的 Pagerank 分數。
  - 2024-11-22 完善了 Agent 中的變數定義和使用。
  - 2024-11-01 對解析後的 chunk 加入關鍵字抽取和相關問題產生以提高回想的準確度。
  - 2024-08-22 支援用 RAG 技術實現從自然語言到 SQL 語句的轉換。
- - 2024-08-02 支持 GraphRAG 啟發於 [graphrag](https://github.com/microsoft/graphrag) 和心智圖。
 
  ## 🎉 關注項目
 
README_zh.md CHANGED
@@ -7,6 +7,7 @@
  <p align="center">
  <a href="./README.md">English</a> |
  <a href="./README_zh.md">简体中文</a> |
+ <a href="./README_tzh.md">繁体中文</a> |
  <a href="./README_ja.md">日本語</a> |
  <a href="./README_ko.md">한국어</a> |
  <a href="./README_id.md">Bahasa Indonesia</a> |
@@ -54,12 +55,12 @@
 
  ## 🔥 近期更新
 
+ - 2025-01-26 优化知识图谱的提取和应用,提供了多种配置选择。
  - 2024-12-18 升级了 Deepdoc 的文档布局分析模型。
  - 2024-12-04 支持知识库的 Pagerank 分数。
  - 2024-11-22 完善了 Agent 中的变量定义和使用。
  - 2024-11-01 对解析后的 chunk 加入关键词抽取和相关问题生成以提高召回的准确度。
  - 2024-08-22 支持用 RAG 技术实现从自然语言到 SQL 语句的转换。
- - 2024-08-02 支持 GraphRAG 启发于 [graphrag](https://github.com/microsoft/graphrag) 和思维导图。
 
  ## 🎉 关注项目
 
api/apps/kb_app.py CHANGED
@@ -24,6 +24,7 @@ from api.db.services.document_service import DocumentService
  from api.db.services.file2document_service import File2DocumentService
  from api.db.services.file_service import FileService
  from api.db.services.user_service import TenantService, UserTenantService
+ from api.settings import DOC_ENGINE
  from api.utils.api_utils import server_error_response, get_data_error_result, validate_request, not_allowed_parameters
  from api.utils import get_uuid
  from api.db import StatusEnum, FileSource
@@ -96,6 +97,13 @@ def update():
  return get_data_error_result(
  message="Can't find this knowledgebase!")
 
+ if req.get("parser_id", "") == "tag" and DOC_ENGINE == "infinity":
+ return get_json_result(
+ data=False,
+ message='The chunk method Tag has not been supported by Infinity yet.',
+ code=settings.RetCode.OPERATING_ERROR
+ )
+
  if req["name"].lower() != kb.name.lower() \
  and len(
  KnowledgebaseService.query(name=req["name"], tenant_id=current_user.id, status=StatusEnum.VALID.value)) > 1:
@@ -112,7 +120,7 @@ def update():
  search.index_name(kb.tenant_id), kb.id)
  else:
  # Elasticsearch requires PAGERANK_FLD be non-zero!
- settings.docStoreConn.update({"exist": PAGERANK_FLD}, {"remove": PAGERANK_FLD},
+ settings.docStoreConn.update({"exists": PAGERANK_FLD}, {"remove": PAGERANK_FLD},
  search.index_name(kb.tenant_id), kb.id)
 
  e, kb = KnowledgebaseService.get_by_id(kb.id)
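The one-character fix above (`"exist"` → `"exists"`) matters because the condition key names the doc-store query operator: Elasticsearch's query type is `exists`, so the misspelled key would never match documents carrying the pagerank field. As a hedged illustration only (the real `docStoreConn.update()` implementation is not shown here, and `build_remove_field_body` is a hypothetical helper), such a condition could translate into an `update_by_query` body like this:

```python
# Hypothetical sketch: map a {"exists": FIELD} condition onto an
# Elasticsearch update_by_query body that strips the field from matching docs.
def build_remove_field_body(field: str) -> dict:
    return {
        # "exists" is the actual Elasticsearch query type; "exist" is not.
        "query": {"exists": {"field": field}},
        # Painless script removing the field from each matched document.
        "script": {
            "source": f"ctx._source.remove('{field}')",
            "lang": "painless",
        },
    }

body = build_remove_field_body("pagerank")
```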
docs/references/agent_component_reference/concentrator.mdx CHANGED
@@ -18,7 +18,7 @@ A **Concentrator** component enhances the current UX design. For a component ori
 
  ## Examples
 
- Explore our general-purpose chatbot agent template, featuring a **Concentrator** component (component ID: **medical**) that relays an execution flow from category 2 of the **Categorize** component to the two translator components:
+ Explore our general-purpose chatbot agent template, featuring a **Concentrator** component (component ID: **medical**) that relays an execution flow from category 2 of the **Categorize** component to two translator components:
 
  1. Click the **Agent** tab at the top center of the page to access the **Agent** page.
  2. Click **+ Create agent** on the top right of the page to open the **agent template** page.
docs/references/supported_models.mdx CHANGED
@@ -12,7 +12,7 @@ A complete list of models supported by RAGFlow, which will continue to expand.
  <APITable>
  ```
 
- | Provider | Chat | Embedding | Rerank | Multimodal | ASR | TTS |
+ | Provider | Chat | Embedding | Rerank | Img2txt | Sequence2txt | TTS |
  | --------------------- | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ |
  | Anthropic | :heavy_check_mark: | | | | | |
  | Azure-OpenAI | :heavy_check_mark: | :heavy_check_mark: | | :heavy_check_mark: | :heavy_check_mark: | |
@@ -26,6 +26,7 @@ A complete list of models supported by RAGFlow, which will continue to expand.
  | Fish Audio | | | | | | :heavy_check_mark: |
  | Gemini | :heavy_check_mark: | :heavy_check_mark: | | :heavy_check_mark: | | |
  | Google Cloud | :heavy_check_mark: | | | | | |
+ | GPUStack | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | | :heavy_check_mark: | :heavy_check_mark: |
  | Groq | :heavy_check_mark: | | | | | |
  | HuggingFace | :heavy_check_mark: | :heavy_check_mark: | | | | |
  | Jina | | :heavy_check_mark: | :heavy_check_mark: | | | |
graphrag/general/graph_extractor.py CHANGED
@@ -135,7 +135,7 @@ class GraphExtractor(Extractor):
  break
  history.append({"role": "assistant", "content": response})
  history.append({"role": "user", "content": LOOP_PROMPT})
- continuation = self._chat("", history, self._loop_args)
+ continuation = self._chat("", history, {"temperature": 0.8})
  token_count += num_tokens_from_string("\n".join([m["content"] for m in history]) + response)
  if continuation != "YES":
  break
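The hunk above pins the continuation check's generation parameters to a literal `{"temperature": 0.8}` instead of the previous `self._loop_args`. The surrounding loop is GraphRAG-style "gleaning": keep asking the model whether entities were missed until it stops answering `YES`. A minimal, self-contained sketch of that pattern, with a stubbed chat function (`glean`, `fake_chat`, and the prompt wording here are illustrative, not the extractor's actual code):

```python
# Illustrative continuation ("gleaning") loop: re-prompt until the model
# stops answering "YES", collecting each extra extraction pass.
LOOP_PROMPT = "Answer YES if some entities may have still been missed."

def glean(chat, first_response, max_gleanings=3):
    history = [{"role": "assistant", "content": first_response}]
    results = [first_response]
    for _ in range(max_gleanings):
        history.append({"role": "user", "content": LOOP_PROMPT})
        # Mirrors the diff: the continuation check uses a fixed temperature.
        if chat("", history, {"temperature": 0.8}) != "YES":
            break
        extra = chat("", history, {"temperature": 0.8})
        history.append({"role": "assistant", "content": extra})
        results.append(extra)
    return results

# Stub chat that says YES once, yields one extra pass, then stops.
answers = iter(["YES", "more entities", "NO"])
def fake_chat(system, history, gen_conf):
    return next(answers)

out = glean(fake_chat, "initial extraction")
```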
pyproject.toml CHANGED
@@ -59,8 +59,8 @@ dependencies = [
  "nltk==3.9.1",
  "numpy>=1.26.0,<2.0.0",
  "ollama==0.2.1",
- "onnxruntime==1.19.2; sys_platform == 'darwin' or platform_machine == 'arm64'",
- "onnxruntime-gpu==1.19.2; platform_machine == 'x86_64'",
+ "onnxruntime==1.19.2; sys_platform == 'darwin' or platform_machine != 'x86_64'",
+ "onnxruntime-gpu==1.19.2; sys_platform != 'darwin' and platform_machine == 'x86_64'",
  "openai==1.45.0",
  "opencv-python==4.10.0.84",
  "opencv-python-headless==4.10.0.84",
@@ -128,8 +128,8 @@ dependencies = [
  [project.optional-dependencies]
  full = [
  "bcembedding==0.1.5",
- "fastembed>=0.3.6,<0.4.0; sys_platform == 'darwin' or platform_machine == 'arm64'",
- "fastembed-gpu>=0.3.6,<0.4.0; platform_machine == 'x86_64'",
+ "fastembed>=0.3.6,<0.4.0; sys_platform == 'darwin' or platform_machine != 'x86_64'",
+ "fastembed-gpu>=0.3.6,<0.4.0; sys_platform != 'darwin' and platform_machine == 'x86_64'",
  "flagembedding==1.2.10",
  "torch>=2.5.0,<3.0.0",
  "transformers>=4.35.0,<5.0.0"
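The old PEP 508 markers could select both wheels on macOS x86_64 (it matches `sys_platform == 'darwin'` and `platform_machine == 'x86_64'`) and neither wheel on, say, Linux aarch64 (the machine string is `aarch64`, not `arm64`). A small pure-Python emulation of the corrected markers (the function name is illustrative) shows they are now mutually exclusive and cover every platform:

```python
# Emulates the corrected environment markers for the onnxruntime pair:
#   onnxruntime-gpu : sys_platform != 'darwin' and platform_machine == 'x86_64'
#   onnxruntime     : sys_platform == 'darwin' or platform_machine != 'x86_64'
def onnxruntime_wheel(sys_platform: str, platform_machine: str) -> str:
    if sys_platform != "darwin" and platform_machine == "x86_64":
        return "onnxruntime-gpu"
    # The remaining condition is exactly the complement of the one above.
    return "onnxruntime"
```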
rag/app/table.py CHANGED
@@ -102,9 +102,9 @@ def column_data_type(arr):
  for a in arr:
  if a is None:
  continue
- if re.match(r"[+-]?[0-9]+(\.0+)?$", str(a).replace("%%", "")):
+ if re.match(r"[+-]?[0-9]{,19}(\.0+)?$", str(a).replace("%%", "")):
  counts["int"] += 1
- elif re.match(r"[+-]?[0-9.]+$", str(a).replace("%%", "")):
+ elif re.match(r"[+-]?[0-9.]{,19}$", str(a).replace("%%", "")):
  counts["float"] += 1
  elif re.match(r"(true|yes|是|\*|✓|✔|☑|✅|√|false|no|否|⍻|×)$", str(a), flags=re.IGNORECASE):
  counts["bool"] += 1
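The new `{,19}` bound (Python's `re` reads it as `{0,19}`) caps the digit count so that very long digit strings are no longer classified as numeric; 19 digits is roughly the range of a signed 64-bit integer, which such a value would overflow in a typical int column. A standalone check of the updated int pattern:

```python
import re

# Mirrors the updated int check from column_data_type: at most 19 digits,
# optionally followed by a ".0…" suffix.
def looks_like_int(value: str) -> bool:
    return re.match(r"[+-]?[0-9]{,19}(\.0+)?$", value) is not None

print(looks_like_int("123"))     # True
print(looks_like_int("-42.0"))   # True
print(looks_like_int("3.14"))    # False: non-zero fraction
print(looks_like_int("9" * 25))  # False: too long for a 64-bit column
```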
rag/llm/__init__.py CHANGED
@@ -13,6 +13,8 @@
  # See the License for the specific language governing permissions and
  # limitations under the License.
  #
+ # AFTER UPDATING THIS FILE, PLEASE ENSURE THAT docs/references/supported_models.mdx IS ALSO UPDATED for consistency!
+ #
  from .embedding_model import (
  OllamaEmbed,
  LocalAIEmbed,
rag/llm/chat_model.py CHANGED
@@ -53,7 +53,7 @@ class Base(ABC):
  ans += LENGTH_NOTIFICATION_CN
  else:
  ans += LENGTH_NOTIFICATION_EN
- return ans, response.usage.total_tokens
+ return ans, self.total_token_count(response)
  except openai.APIError as e:
  return "**ERROR**: " + str(e), 0
 
@@ -75,15 +75,11 @@ class Base(ABC):
  resp.choices[0].delta.content = ""
  ans += resp.choices[0].delta.content
 
- if not hasattr(resp, "usage") or not resp.usage:
- total_tokens = (
- total_tokens
- + num_tokens_from_string(resp.choices[0].delta.content)
- )
- elif isinstance(resp.usage, dict):
- total_tokens = resp.usage.get("total_tokens", total_tokens)
+ tol = self.total_token_count(resp)
+ if not tol:
+ total_tokens += num_tokens_from_string(resp.choices[0].delta.content)
  else:
- total_tokens = resp.usage.total_tokens
+ total_tokens = tol
 
  if resp.choices[0].finish_reason == "length":
  if is_chinese(ans):
@@ -97,6 +93,17 @@ class Base(ABC):
 
  yield total_tokens
 
+ def total_token_count(self, resp):
+ try:
+ return resp.usage.total_tokens
+ except Exception:
+ pass
+ try:
+ return resp["usage"]["total_tokens"]
+ except Exception:
+ pass
+ return 0
+
 
  class GptTurbo(Base):
  def __init__(self, key, model_name="gpt-3.5-turbo", base_url="https://api.openai.com/v1"):
@@ -182,7 +189,7 @@ class BaiChuanChat(Base):
  ans += LENGTH_NOTIFICATION_CN
  else:
  ans += LENGTH_NOTIFICATION_EN
- return ans, response.usage.total_tokens
+ return ans, self.total_token_count(response)
  except openai.APIError as e:
  return "**ERROR**: " + str(e), 0
 
@@ -212,14 +219,11 @@ class BaiChuanChat(Base):
  if not resp.choices[0].delta.content:
  resp.choices[0].delta.content = ""
  ans += resp.choices[0].delta.content
- total_tokens = (
- (
- total_tokens
- + num_tokens_from_string(resp.choices[0].delta.content)
- )
- if not hasattr(resp, "usage")
- else resp.usage["total_tokens"]
- )
+ tol = self.total_token_count(resp)
+ if not tol:
+ total_tokens += num_tokens_from_string(resp.choices[0].delta.content)
+ else:
+ total_tokens = tol
  if resp.choices[0].finish_reason == "length":
  if is_chinese([ans]):
  ans += LENGTH_NOTIFICATION_CN
@@ -256,7 +260,7 @@ class QWenChat(Base):
  tk_count = 0
  if response.status_code == HTTPStatus.OK:
  ans += response.output.choices[0]['message']['content']
- tk_count += response.usage.total_tokens
+ tk_count += self.total_token_count(response)
  if response.output.choices[0].get("finish_reason", "") == "length":
  if is_chinese([ans]):
  ans += LENGTH_NOTIFICATION_CN
@@ -292,7 +296,7 @@ class QWenChat(Base):
  for resp in response:
  if resp.status_code == HTTPStatus.OK:
  ans = resp.output.choices[0]['message']['content']
- tk_count = resp.usage.total_tokens
+ tk_count = self.total_token_count(resp)
  if resp.output.choices[0].get("finish_reason", "") == "length":
  if is_chinese(ans):
  ans += LENGTH_NOTIFICATION_CN
@@ -334,7 +338,7 @@ class ZhipuChat(Base):
  ans += LENGTH_NOTIFICATION_CN
  else:
  ans += LENGTH_NOTIFICATION_EN
- return ans, response.usage.total_tokens
+ return ans, self.total_token_count(response)
  except Exception as e:
  return "**ERROR**: " + str(e), 0
 
@@ -364,9 +368,9 @@ class ZhipuChat(Base):
  ans += LENGTH_NOTIFICATION_CN
  else:
  ans += LENGTH_NOTIFICATION_EN
- tk_count = resp.usage.total_tokens
+ tk_count = self.total_token_count(resp)
  if resp.choices[0].finish_reason == "stop":
- tk_count = resp.usage.total_tokens
+ tk_count = self.total_token_count(resp)
  yield ans
  except Exception as e:
  yield ans + "\n**ERROR**: " + str(e)
@@ -569,7 +573,7 @@ class MiniMaxChat(Base):
  ans += LENGTH_NOTIFICATION_CN
  else:
  ans += LENGTH_NOTIFICATION_EN
- return ans, response["usage"]["total_tokens"]
+ return ans, self.total_token_count(response)
  except Exception as e:
  return "**ERROR**: " + str(e), 0
 
@@ -603,11 +607,11 @@ class MiniMaxChat(Base):
  if "choices" in resp and "delta" in resp["choices"][0]:
  text = resp["choices"][0]["delta"]["content"]
  ans += text
- total_tokens = (
- total_tokens + num_tokens_from_string(text)
- if "usage" not in resp
- else resp["usage"]["total_tokens"]
- )
+ tol = self.total_token_count(resp)
+ if not tol:
+ total_tokens += num_tokens_from_string(text)
+ else:
+ total_tokens = tol
  yield ans
 
  except Exception as e:
@@ -640,7 +644,7 @@ class MistralChat(Base):
  ans += LENGTH_NOTIFICATION_CN
  else:
  ans += LENGTH_NOTIFICATION_EN
- return ans, response.usage.total_tokens
+ return ans, self.total_token_count(response)
  except openai.APIError as e:
  return "**ERROR**: " + str(e), 0
 
@@ -838,7 +842,7 @@ class GeminiChat(Base):
  yield 0
 
 
- class GroqChat:
+ class GroqChat(Base):
  def __init__(self, key, model_name, base_url=''):
  from groq import Groq
  self.client = Groq(api_key=key)
@@ -863,7 +867,7 @@ class GroqChat:
  ans += LENGTH_NOTIFICATION_CN
  else:
  ans += LENGTH_NOTIFICATION_EN
- return ans, response.usage.total_tokens
+ return ans, self.total_token_count(response)
  except Exception as e:
  return ans + "\n**ERROR**: " + str(e), 0
 
@@ -1255,7 +1259,7 @@ class BaiduYiyanChat(Base):
  **gen_conf
  ).body
  ans = response['result']
- return ans, response["usage"]["total_tokens"]
+ return ans, self.total_token_count(response)
 
  except Exception as e:
  return ans + "\n**ERROR**: " + str(e), 0
@@ -1283,7 +1287,7 @@ class BaiduYiyanChat(Base):
  for resp in response:
  resp = resp.body
  ans += resp['result']
- total_tokens = resp["usage"]["total_tokens"]
+ total_tokens = self.total_token_count(resp)
 
  yield ans
 
rag/llm/embedding_model.py CHANGED
@@ -44,11 +44,23 @@ class Base(ABC):
  def encode_queries(self, text: str):
  raise NotImplementedError("Please implement encode method!")
 
 
  class DefaultEmbedding(Base):
  _model = None
  _model_name = ""
  _model_lock = threading.Lock()
 
  def __init__(self, key, model_name, **kwargs):
  """
  If you have trouble downloading HuggingFace models, -_^ this might help!!
@@ -115,13 +127,13 @@ class OpenAIEmbed(Base):
  res = self.client.embeddings.create(input=texts[i:i + batch_size],
  model=self.model_name)
  ress.extend([d.embedding for d in res.data])
- total_tokens += res.usage.total_tokens
  return np.array(ress), total_tokens
 
  def encode_queries(self, text):
  res = self.client.embeddings.create(input=[truncate(text, 8191)],
  model=self.model_name)
- return np.array(res.data[0].embedding), res.usage.total_tokens
 
 
  class LocalAIEmbed(Base):
@@ -188,7 +200,7 @@ class QWenEmbed(Base):
  for e in resp["output"]["embeddings"]:
  embds[e["text_index"]] = e["embedding"]
  res.extend(embds)
- token_count += resp["usage"]["total_tokens"]
  return np.array(res), token_count
  except Exception as e:
  raise Exception("Account abnormal. Please ensure it's on good standing to use QWen's "+self.model_name)
@@ -203,7 +215,7 @@ class QWenEmbed(Base):
  text_type="query"
  )
  return np.array(resp["output"]["embeddings"][0]
- ["embedding"]), resp["usage"]["total_tokens"]
  except Exception:
  raise Exception("Account abnormal. Please ensure it's on good standing to use QWen's "+self.model_name)
  return np.array([]), 0
@@ -229,13 +241,13 @@ class ZhipuEmbed(Base):
  res = self.client.embeddings.create(input=txt,
  model=self.model_name)
  arr.append(res.data[0].embedding)
- tks_num += res.usage.total_tokens
  return np.array(arr), tks_num
 
  def encode_queries(self, text):
  res = self.client.embeddings.create(input=text,
  model=self.model_name)
- return np.array(res.data[0].embedding), res.usage.total_tokens
 
 
  class OllamaEmbed(Base):
@@ -318,13 +330,13 @@ class XinferenceEmbed(Base):
  for i in range(0, len(texts), batch_size):
  res = self.client.embeddings.create(input=texts[i:i + batch_size], model=self.model_name)
  ress.extend([d.embedding for d in res.data])
- total_tokens += res.usage.total_tokens
  return np.array(ress), total_tokens
 
  def encode_queries(self, text):
  res = self.client.embeddings.create(input=[text],
  model=self.model_name)
- return np.array(res.data[0].embedding), res.usage.total_tokens
 
 
  class YoudaoEmbed(Base):
@@ -383,7 +395,7 @@ class JinaEmbed(Base):
  }
  res = requests.post(self.base_url, headers=self.headers, json=data).json()
  ress.extend([d["embedding"] for d in res["data"]])
- token_count += res["usage"]["total_tokens"]
  return np.array(ress), token_count
 
  def encode_queries(self, text):
@@ -447,13 +459,13 @@ class MistralEmbed(Base):
  res = self.client.embeddings(input=texts[i:i + batch_size],
  model=self.model_name)
  ress.extend([d.embedding for d in res.data])
- token_count += res.usage.total_tokens
  return np.array(ress), token_count
 
  def encode_queries(self, text):
  res = self.client.embeddings(input=[truncate(text, 8196)],
  model=self.model_name)
- return np.array(res.data[0].embedding), res.usage.total_tokens
 
 
  class BedrockEmbed(Base):
@@ -565,7 +577,7 @@ class NvidiaEmbed(Base):
  }
  res = requests.post(self.base_url, headers=self.headers, json=payload).json()
  ress.extend([d["embedding"] for d in res["data"]])
- token_count += res["usage"]["total_tokens"]
  return np.array(ress), token_count
 
  def encode_queries(self, text):
@@ -677,7 +689,7 @@ class SILICONFLOWEmbed(Base):
  if "data" not in res or not isinstance(res["data"], list) or len(res["data"]) != len(texts_batch):
  raise ValueError(f"SILICONFLOWEmbed.encode got invalid response from {self.base_url}")
  ress.extend([d["embedding"] for d in res["data"]])
- token_count += res["usage"]["total_tokens"]
  return np.array(ress), token_count
 
  def encode_queries(self, text):
@@ -689,7 +701,7 @@ class SILICONFLOWEmbed(Base):
  res = requests.post(self.base_url, json=payload, headers=self.headers).json()
  if "data" not in res or not isinstance(res["data"], list) or len(res["data"])!= 1:
  raise ValueError(f"SILICONFLOWEmbed.encode_queries got invalid response from {self.base_url}")
- return np.array(res["data"][0]["embedding"]), res["usage"]["total_tokens"]
 
 
  class ReplicateEmbed(Base):
@@ -727,14 +739,14 @@ class BaiduYiyanEmbed(Base):
  res = self.client.do(model=self.model_name, texts=texts).body
  return (
  np.array([r["embedding"] for r in res["data"]]),
- res["usage"]["total_tokens"],
  )
 
  def encode_queries(self, text):
  res = self.client.do(model=self.model_name, texts=[text]).body
  return (
  np.array([r["embedding"] for r in res["data"]]),
- res["usage"]["total_tokens"],
738
  )
739
 
740
 
 
44
  def encode_queries(self, text: str):
45
  raise NotImplementedError("Please implement encode method!")
46
 
47
+ def total_token_count(self, resp):
48
+ try:
49
+ return resp.usage.total_tokens
50
+ except Exception:
51
+ pass
52
+ try:
53
+ return resp["usage"]["total_tokens"]
54
+ except Exception:
55
+ pass
56
+ return 0
57
+
58
 
59
  class DefaultEmbedding(Base):
60
  _model = None
61
  _model_name = ""
62
  _model_lock = threading.Lock()
63
+
64
  def __init__(self, key, model_name, **kwargs):
65
  """
66
  If you have trouble downloading HuggingFace models, -_^ this might help!!
 
127
  res = self.client.embeddings.create(input=texts[i:i + batch_size],
128
  model=self.model_name)
129
  ress.extend([d.embedding for d in res.data])
130
+ total_tokens += self.total_token_count(res)
131
  return np.array(ress), total_tokens
132
 
133
  def encode_queries(self, text):
134
  res = self.client.embeddings.create(input=[truncate(text, 8191)],
135
  model=self.model_name)
136
+ return np.array(res.data[0].embedding), self.total_token_count(res)
137
 
138
 
139
  class LocalAIEmbed(Base):
 
200
  for e in resp["output"]["embeddings"]:
201
  embds[e["text_index"]] = e["embedding"]
202
  res.extend(embds)
203
+ token_count += self.total_token_count(resp)
204
  return np.array(res), token_count
205
  except Exception as e:
206
  raise Exception("Account abnormal. Please ensure it's on good standing to use QWen's "+self.model_name)
 
215
  text_type="query"
216
  )
217
  return np.array(resp["output"]["embeddings"][0]
218
+ ["embedding"]), self.total_token_count(resp)
219
  except Exception:
220
  raise Exception("Account abnormal. Please ensure it's on good standing to use QWen's "+self.model_name)
221
  return np.array([]), 0
 
241
  res = self.client.embeddings.create(input=txt,
242
  model=self.model_name)
243
  arr.append(res.data[0].embedding)
244
+ tks_num += self.total_token_count(res)
245
  return np.array(arr), tks_num
246
 
247
  def encode_queries(self, text):
248
  res = self.client.embeddings.create(input=text,
249
  model=self.model_name)
250
+ return np.array(res.data[0].embedding), self.total_token_count(res)
251
 
252
 
253
  class OllamaEmbed(Base):
 
330
  for i in range(0, len(texts), batch_size):
331
  res = self.client.embeddings.create(input=texts[i:i + batch_size], model=self.model_name)
332
  ress.extend([d.embedding for d in res.data])
333
+ total_tokens += self.total_token_count(res)
334
  return np.array(ress), total_tokens
335
 
336
  def encode_queries(self, text):
337
  res = self.client.embeddings.create(input=[text],
338
  model=self.model_name)
339
+ return np.array(res.data[0].embedding), self.total_token_count(res)
340
 
341
 
342
  class YoudaoEmbed(Base):
 
395
  }
396
  res = requests.post(self.base_url, headers=self.headers, json=data).json()
397
  ress.extend([d["embedding"] for d in res["data"]])
398
+ token_count += self.total_token_count(res)
399
  return np.array(ress), token_count
400
 
401
  def encode_queries(self, text):
 
459
  res = self.client.embeddings(input=texts[i:i + batch_size],
460
  model=self.model_name)
461
  ress.extend([d.embedding for d in res.data])
462
+ token_count += self.total_token_count(res)
463
  return np.array(ress), token_count
464
 
465
  def encode_queries(self, text):
466
  res = self.client.embeddings(input=[truncate(text, 8196)],
467
  model=self.model_name)
468
+ return np.array(res.data[0].embedding), self.total_token_count(res)
469
 
470
 
471
  class BedrockEmbed(Base):
 
577
  }
578
  res = requests.post(self.base_url, headers=self.headers, json=payload).json()
579
  ress.extend([d["embedding"] for d in res["data"]])
580
+ token_count += self.total_token_count(res)
581
  return np.array(ress), token_count
582
 
583
  def encode_queries(self, text):
 
689
  if "data" not in res or not isinstance(res["data"], list) or len(res["data"]) != len(texts_batch):
690
  raise ValueError(f"SILICONFLOWEmbed.encode got invalid response from {self.base_url}")
691
  ress.extend([d["embedding"] for d in res["data"]])
692
+ token_count += self.total_token_count(res)
693
  return np.array(ress), token_count
694
 
695
  def encode_queries(self, text):
 
701
  res = requests.post(self.base_url, json=payload, headers=self.headers).json()
702
  if "data" not in res or not isinstance(res["data"], list) or len(res["data"])!= 1:
703
  raise ValueError(f"SILICONFLOWEmbed.encode_queries got invalid response from {self.base_url}")
704
+ return np.array(res["data"][0]["embedding"]), self.total_token_count(res)
705
 
706
 
707
  class ReplicateEmbed(Base):
 
739
  res = self.client.do(model=self.model_name, texts=texts).body
740
  return (
741
  np.array([r["embedding"] for r in res["data"]]),
742
+ self.total_token_count(res),
743
  )
744
 
745
  def encode_queries(self, text):
746
  res = self.client.do(model=self.model_name, texts=[text]).body
747
  return (
748
  np.array([r["embedding"] for r in res["data"]]),
749
+ self.total_token_count(res),
750
  )
751
 
752
 
rag/llm/rerank_model.py CHANGED
@@ -42,6 +42,17 @@ class Base(ABC):
42
  def similarity(self, query: str, texts: list):
43
  raise NotImplementedError("Please implement encode method!")
44
 
45
 
46
  class DefaultRerank(Base):
47
  _model = None
@@ -115,7 +126,7 @@ class JinaRerank(Base):
115
  rank = np.zeros(len(texts), dtype=float)
116
  for d in res["results"]:
117
  rank[d["index"]] = d["relevance_score"]
118
- return rank, res["usage"]["total_tokens"]
119
 
120
 
121
  class YoudaoRerank(DefaultRerank):
@@ -417,7 +428,7 @@ class BaiduYiyanRerank(Base):
417
  rank = np.zeros(len(texts), dtype=float)
418
  for d in res["results"]:
419
  rank[d["index"]] = d["relevance_score"]
420
- return rank, res["usage"]["total_tokens"]
421
 
422
 
423
  class VoyageRerank(Base):
 
42
  def similarity(self, query: str, texts: list):
43
  raise NotImplementedError("Please implement encode method!")
44
 
45
+ def total_token_count(self, resp):
46
+ try:
47
+ return resp.usage.total_tokens
48
+ except Exception:
49
+ pass
50
+ try:
51
+ return resp["usage"]["total_tokens"]
52
+ except Exception:
53
+ pass
54
+ return 0
55
+
56
 
57
  class DefaultRerank(Base):
58
  _model = None
 
126
  rank = np.zeros(len(texts), dtype=float)
127
  for d in res["results"]:
128
  rank[d["index"]] = d["relevance_score"]
129
+ return rank, self.total_token_count(res)
130
 
131
 
132
  class YoudaoRerank(DefaultRerank):
 
428
  rank = np.zeros(len(texts), dtype=float)
429
  for d in res["results"]:
430
  rank[d["index"]] = d["relevance_score"]
431
+ return rank, self.total_token_count(res)
432
 
433
 
434
  class VoyageRerank(Base):
rag/nlp/search.py CHANGED
@@ -465,7 +465,7 @@ class Dealer:
465
  if not aggs:
466
  return False
467
  cnt = np.sum([c for _, c in aggs])
468
- tag_fea = sorted([(a, round(0.1*(c + 1) / (cnt + S) / (all_tags.get(a, 0.0001)))) for a, c in aggs],
469
  key=lambda x: x[1] * -1)[:topn_tags]
470
  doc[TAG_FLD] = {a: c for a, c in tag_fea if c > 0}
471
  return True
@@ -481,6 +481,6 @@ class Dealer:
481
  if not aggs:
482
  return {}
483
  cnt = np.sum([c for _, c in aggs])
484
- tag_fea = sorted([(a, round(0.1*(c + 1) / (cnt + S) / (all_tags.get(a, 0.0001)))) for a, c in aggs],
485
  key=lambda x: x[1] * -1)[:topn_tags]
486
  return {a: max(1, c) for a, c in tag_fea}
 
465
  if not aggs:
466
  return False
467
  cnt = np.sum([c for _, c in aggs])
468
+ tag_fea = sorted([(a, round(0.1*(c + 1) / (cnt + S) / max(1e-6, all_tags.get(a, 0.0001)))) for a, c in aggs],
469
  key=lambda x: x[1] * -1)[:topn_tags]
470
  doc[TAG_FLD] = {a: c for a, c in tag_fea if c > 0}
471
  return True
 
481
  if not aggs:
482
  return {}
483
  cnt = np.sum([c for _, c in aggs])
484
+ tag_fea = sorted([(a, round(0.1*(c + 1) / (cnt + S) / max(1e-6, all_tags.get(a, 0.0001)))) for a, c in aggs],
485
  key=lambda x: x[1] * -1)[:topn_tags]
486
  return {a: max(1, c) for a, c in tag_fea}
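The only change in `rag/nlp/search.py` wraps the corpus tag frequency in `max(1e-6, ...)` so a stored frequency of 0 cannot divide by zero or blow the ratio up. A self-contained sketch of the scoring (default values for `S` and `topn_tags` are illustrative):

```python
def tag_scores(aggs, all_tags, S=1000, topn_tags=3):
    # aggs: [(tag, doc_count)] from the aggregation;
    # all_tags: corpus-wide tag frequency in [0, 1].
    cnt = sum(c for _, c in aggs)
    # max(1e-6, ...) keeps the divisor strictly positive even when a
    # tag's corpus frequency is stored as 0.
    tag_fea = sorted(
        [(a, round(0.1 * (c + 1) / (cnt + S) / max(1e-6, all_tags.get(a, 0.0001))))
         for a, c in aggs],
        key=lambda x: x[1] * -1)[:topn_tags]
    return {a: c for a, c in tag_fea if c > 0}
```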
rag/raptor.py CHANGED
@@ -71,7 +71,7 @@ class RecursiveAbstractiveProcessing4TreeOrganizedRetrieval:
71
  start, end = 0, len(chunks)
72
  if len(chunks) <= 1:
73
  return
74
- chunks = [(s, a) for s, a in chunks if len(a) > 0]
75
 
76
  def summarize(ck_idx, lock):
77
  nonlocal chunks
@@ -125,6 +125,8 @@ class RecursiveAbstractiveProcessing4TreeOrganizedRetrieval:
125
  threads = []
126
  for c in range(n_clusters):
127
  ck_idx = [i + start for i in range(len(lbls)) if lbls[i] == c]
128
  threads.append(executor.submit(summarize, ck_idx, lock))
129
  wait(threads, return_when=ALL_COMPLETED)
130
  for th in threads:
 
71
  start, end = 0, len(chunks)
72
  if len(chunks) <= 1:
73
  return
74
+ chunks = [(s, a) for s, a in chunks if s and len(a) > 0]
75
 
76
  def summarize(ck_idx, lock):
77
  nonlocal chunks
 
125
  threads = []
126
  for c in range(n_clusters):
127
  ck_idx = [i + start for i in range(len(lbls)) if lbls[i] == c]
128
+ if not ck_idx:
129
+ continue
130
  threads.append(executor.submit(summarize, ck_idx, lock))
131
  wait(threads, return_when=ALL_COMPLETED)
132
  for th in threads:
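Both `raptor.py` fixes guard against degenerate inputs: chunks with empty text or embeddings are dropped up front, and empty clusters no longer spawn a summarization task. A minimal sketch of the cluster-grouping step (function name is illustrative):

```python
def nonempty_clusters(labels, n_clusters, start=0):
    # Group chunk indices by cluster label; a cluster that received no
    # members is skipped instead of being submitted to the executor.
    groups = []
    for c in range(n_clusters):
        ck_idx = [i + start for i, lbl in enumerate(labels) if lbl == c]
        if not ck_idx:
            continue
        groups.append(ck_idx)
    return groups
```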
rag/utils/es_conn.py CHANGED
@@ -336,7 +336,7 @@ class ESConnection(DocStoreConnection):
336
  for k, v in condition.items():
337
  if not isinstance(k, str) or not v:
338
  continue
339
- if k == "exist":
340
  bqry.filter.append(Q("exists", field=v))
341
  continue
342
  if isinstance(v, list):
 
336
  for k, v in condition.items():
337
  if not isinstance(k, str) or not v:
338
  continue
339
+ if k == "exists":
340
  bqry.filter.append(Q("exists", field=v))
341
  continue
342
  if isinstance(v, list):
rag/utils/infinity_conn.py CHANGED
@@ -44,8 +44,23 @@ from rag.utils.doc_store_conn import (
44
  logger = logging.getLogger('ragflow.infinity_conn')
45
 
46
 
47
- def equivalent_condition_to_str(condition: dict) -> str | None:
48
  assert "_id" not in condition
49
  cond = list()
50
  for k, v in condition.items():
51
  if not isinstance(k, str) or k in ["kb_id"] or not v:
@@ -61,8 +76,15 @@ def equivalent_condition_to_str(condition: dict) -> str | None:
61
  strInCond = ", ".join(inCond)
62
  strInCond = f"{k} IN ({strInCond})"
63
  cond.append(strInCond)
64
  elif isinstance(v, str):
65
  cond.append(f"{k}='{v}'")
66
  else:
67
  cond.append(f"{k}={str(v)}")
68
  return " AND ".join(cond) if cond else "1=1"
@@ -273,15 +295,32 @@ class InfinityConnection(DocStoreConnection):
273
  for essential_field in ["id"]:
274
  if essential_field not in selectFields:
275
  selectFields.append(essential_field)
276
  if matchExprs:
277
- for essential_field in ["score()", PAGERANK_FLD]:
278
- selectFields.append(essential_field)
279
 
280
  # Prepare expressions common to all tables
281
  filter_cond = None
282
  filter_fulltext = ""
283
  if condition:
284
- filter_cond = equivalent_condition_to_str(condition)
285
  for matchExpr in matchExprs:
286
  if isinstance(matchExpr, MatchTextExpr):
287
  if filter_cond and "filter" not in matchExpr.extra_options:
@@ -364,7 +403,9 @@ class InfinityConnection(DocStoreConnection):
364
  self.connPool.release_conn(inf_conn)
365
  res = concat_dataframes(df_list, selectFields)
366
  if matchExprs:
367
- res = res.sort(pl.col("SCORE") + pl.col(PAGERANK_FLD), descending=True, maintain_order=True)
368
  res = res.limit(limit)
369
  logger.debug(f"INFINITY search final result: {str(res)}")
370
  return res, total_hits_count
@@ -419,12 +460,21 @@ class InfinityConnection(DocStoreConnection):
419
  self.createIdx(indexName, knowledgebaseId, vector_size)
420
  table_instance = db_instance.get_table(table_name)
421
 
422
  docs = copy.deepcopy(documents)
423
  for d in docs:
424
  assert "_id" not in d
425
  assert "id" in d
426
  for k, v in d.items():
427
- if k in ["important_kwd", "question_kwd", "entities_kwd", "tag_kwd"]:
428
  assert isinstance(v, list)
429
  d[k] = "###".join(v)
430
  elif re.search(r"_feas$", k):
@@ -439,6 +489,11 @@ class InfinityConnection(DocStoreConnection):
439
  elif k in ["page_num_int", "top_int"]:
440
  assert isinstance(v, list)
441
  d[k] = "_".join(f"{num:08x}" for num in v)
442
  ids = ["'{}'".format(d["id"]) for d in docs]
443
  str_ids = ", ".join(ids)
444
  str_filter = f"id IN ({str_ids})"
@@ -460,11 +515,11 @@ class InfinityConnection(DocStoreConnection):
460
  db_instance = inf_conn.get_database(self.dbName)
461
  table_name = f"{indexName}_{knowledgebaseId}"
462
  table_instance = db_instance.get_table(table_name)
463
- if "exist" in condition:
464
- del condition["exist"]
465
- filter = equivalent_condition_to_str(condition)
466
  for k, v in list(newValue.items()):
467
- if k in ["important_kwd", "question_kwd", "entities_kwd", "tag_kwd"]:
468
  assert isinstance(v, list)
469
  newValue[k] = "###".join(v)
470
  elif re.search(r"_feas$", k):
@@ -481,9 +536,11 @@ class InfinityConnection(DocStoreConnection):
481
  elif k in ["page_num_int", "top_int"]:
482
  assert isinstance(v, list)
483
  newValue[k] = "_".join(f"{num:08x}" for num in v)
484
- elif k == "remove" and v in [PAGERANK_FLD]:
485
  del newValue[k]
486
- newValue[v] = 0
487
  logger.debug(f"INFINITY update table {table_name}, filter {filter}, newValue {newValue}.")
488
  table_instance.update(filter, newValue)
489
  self.connPool.release_conn(inf_conn)
@@ -493,14 +550,14 @@ class InfinityConnection(DocStoreConnection):
493
  inf_conn = self.connPool.get_conn()
494
  db_instance = inf_conn.get_database(self.dbName)
495
  table_name = f"{indexName}_{knowledgebaseId}"
496
- filter = equivalent_condition_to_str(condition)
497
  try:
498
  table_instance = db_instance.get_table(table_name)
499
  except Exception:
500
  logger.warning(
501
- f"Skipped deleting `{filter}` from table {table_name} since the table doesn't exist."
502
  )
503
  return 0
 
504
  logger.debug(f"INFINITY delete table {table_name}, filter {filter}.")
505
  res = table_instance.delete(filter)
506
  self.connPool.release_conn(inf_conn)
@@ -538,7 +595,7 @@ class InfinityConnection(DocStoreConnection):
538
  v = res[fieldnm][i]
539
  if isinstance(v, Series):
540
  v = list(v)
541
- elif fieldnm in ["important_kwd", "question_kwd", "entities_kwd", "tag_kwd"]:
542
  assert isinstance(v, str)
543
  v = [kwd for kwd in v.split("###") if kwd]
544
  elif fieldnm == "position_int":
@@ -569,6 +626,8 @@ class InfinityConnection(DocStoreConnection):
569
  ans = {}
570
  num_rows = len(res)
571
  column_id = res["id"]
572
  for i in range(num_rows):
573
  id = column_id[i]
574
  txt = res[fieldnm][i]
 
44
  logger = logging.getLogger('ragflow.infinity_conn')
45
 
46
 
47
+ def equivalent_condition_to_str(condition: dict, table_instance=None) -> str | None:
48
  assert "_id" not in condition
49
+ clmns = {}
50
+ if table_instance:
51
+ for n, ty, de, _ in table_instance.show_columns().rows():
52
+ clmns[n] = (ty, de)
53
+
54
+ def exists(cln):
55
+ nonlocal clmns
56
+ assert cln in clmns, f"'{cln}' should be in '{clmns}'."
57
+ ty, de = clmns[cln]
58
+ if "cha" in ty.lower():
59
+ if not de:
60
+ de = ""
61
+ return f" {cln}!='{de}' "
62
+ return f"{cln}!={de}"
63
+
64
  cond = list()
65
  for k, v in condition.items():
66
  if not isinstance(k, str) or k in ["kb_id"] or not v:
 
76
  strInCond = ", ".join(inCond)
77
  strInCond = f"{k} IN ({strInCond})"
78
  cond.append(strInCond)
79
+ elif k == "must_not":
80
+ if isinstance(v, dict):
81
+ for kk, vv in v.items():
82
+ if kk == "exists":
83
+ cond.append("NOT (%s)" % exists(vv))
84
  elif isinstance(v, str):
85
  cond.append(f"{k}='{v}'")
86
+ elif k == "exists":
87
+ cond.append(exists(v))
88
  else:
89
  cond.append(f"{k}={str(v)}")
90
  return " AND ".join(cond) if cond else "1=1"
 
295
  for essential_field in ["id"]:
296
  if essential_field not in selectFields:
297
  selectFields.append(essential_field)
298
+ score_func = ""
299
+ score_column = ""
300
+ for matchExpr in matchExprs:
301
+ if isinstance(matchExpr, MatchTextExpr):
302
+ score_func = "score()"
303
+ score_column = "SCORE"
304
+ break
305
+ if not score_func:
306
+ for matchExpr in matchExprs:
307
+ if isinstance(matchExpr, MatchDenseExpr):
308
+ score_func = "similarity()"
309
+ score_column = "SIMILARITY"
310
+ break
311
  if matchExprs:
312
+ selectFields.append(score_func)
313
+ selectFields.append(PAGERANK_FLD)
314
 
315
  # Prepare expressions common to all tables
316
  filter_cond = None
317
  filter_fulltext = ""
318
  if condition:
319
+ for indexName in indexNames:
320
+ table_name = f"{indexName}_{knowledgebaseIds[0]}"
321
+ filter_cond = equivalent_condition_to_str(condition, db_instance.get_table(table_name))
322
+ break
323
+
324
  for matchExpr in matchExprs:
325
  if isinstance(matchExpr, MatchTextExpr):
326
  if filter_cond and "filter" not in matchExpr.extra_options:
 
403
  self.connPool.release_conn(inf_conn)
404
  res = concat_dataframes(df_list, selectFields)
405
  if matchExprs:
406
+ res = res.sort(pl.col(score_column) + pl.col(PAGERANK_FLD), descending=True, maintain_order=True)
407
+ if score_column and score_column != "SCORE":
408
+ res = res.rename({score_column: "SCORE"})
409
  res = res.limit(limit)
410
  logger.debug(f"INFINITY search final result: {str(res)}")
411
  return res, total_hits_count
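The search path now derives its score column from the kind of match expression present: full-text matches use `score()`/`SCORE`, and when only dense-vector matches exist it falls back to `similarity()`/`SIMILARITY`, renamed to `SCORE` afterwards so downstream code sees a single column name. The selection logic in isolation (dict-based exprs stand in for `MatchTextExpr`/`MatchDenseExpr`):

```python
def pick_score_func(match_exprs):
    # Full-text search wins if present; otherwise use vector similarity.
    for e in match_exprs:
        if e["kind"] == "text":
            return "score()", "SCORE"
    for e in match_exprs:
        if e["kind"] == "dense":
            return "similarity()", "SIMILARITY"
    return "", ""
```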
 
460
  self.createIdx(indexName, knowledgebaseId, vector_size)
461
  table_instance = db_instance.get_table(table_name)
462
 
463
+ # Embedding fields can't have a default value, so missing ones are filled with zero vectors.
464
+ embedding_clmns = []
465
+ clmns = table_instance.show_columns().rows()
466
+ for n, ty, _, _ in clmns:
467
+ r = re.search(r"Embedding\([a-z]+,([0-9]+)\)", ty)
468
+ if not r:
469
+ continue
470
+ embedding_clmns.append((n, int(r.group(1))))
471
+
472
  docs = copy.deepcopy(documents)
473
  for d in docs:
474
  assert "_id" not in d
475
  assert "id" in d
476
  for k, v in d.items():
477
+ if k in ["important_kwd", "question_kwd", "entities_kwd", "tag_kwd", "source_id"]:
478
  assert isinstance(v, list)
479
  d[k] = "###".join(v)
480
  elif re.search(r"_feas$", k):
 
489
  elif k in ["page_num_int", "top_int"]:
490
  assert isinstance(v, list)
491
  d[k] = "_".join(f"{num:08x}" for num in v)
492
+
493
+ for n, vs in embedding_clmns:
494
+ if n in d:
495
+ continue
496
+ d[n] = [0] * vs
497
  ids = ["'{}'".format(d["id"]) for d in docs]
498
  str_ids = ", ".join(ids)
499
  str_filter = f"id IN ({str_ids})"
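Because Infinity embedding columns cannot declare a default value, the insert path now scans `show_columns()` for `Embedding(<type>,<dim>)` columns and fills any missing ones with a zero vector. The parsing-and-fill step on its own (column tuples simplified to `(name, type)`):

```python
import re

def fill_missing_embeddings(columns, doc):
    # columns: [(name, type_string)]; embedding columns render as e.g.
    # "Embedding(float,1024)". Missing ones get a zero vector of that size.
    for name, ty in columns:
        m = re.search(r"Embedding\([a-z]+,([0-9]+)\)", ty)
        if m and name not in doc:
            doc[name] = [0] * int(m.group(1))
    return doc
```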
 
515
  db_instance = inf_conn.get_database(self.dbName)
516
  table_name = f"{indexName}_{knowledgebaseId}"
517
  table_instance = db_instance.get_table(table_name)
518
+ #if "exists" in condition:
519
+ # del condition["exists"]
520
+ filter = equivalent_condition_to_str(condition, table_instance)
521
  for k, v in list(newValue.items()):
522
+ if k in ["important_kwd", "question_kwd", "entities_kwd", "tag_kwd", "source_id"]:
523
  assert isinstance(v, list)
524
  newValue[k] = "###".join(v)
525
  elif re.search(r"_feas$", k):
 
536
  elif k in ["page_num_int", "top_int"]:
537
  assert isinstance(v, list)
538
  newValue[k] = "_".join(f"{num:08x}" for num in v)
539
+ elif k == "remove":
540
  del newValue[k]
541
+ if v in [PAGERANK_FLD]:
542
+ newValue[v] = 0
543
+
544
  logger.debug(f"INFINITY update table {table_name}, filter {filter}, newValue {newValue}.")
545
  table_instance.update(filter, newValue)
546
  self.connPool.release_conn(inf_conn)
 
550
  inf_conn = self.connPool.get_conn()
551
  db_instance = inf_conn.get_database(self.dbName)
552
  table_name = f"{indexName}_{knowledgebaseId}"
 
553
  try:
554
  table_instance = db_instance.get_table(table_name)
555
  except Exception:
556
  logger.warning(
557
+ f"Skipped deleting from table {table_name} since the table doesn't exist."
558
  )
559
  return 0
560
+ filter = equivalent_condition_to_str(condition, table_instance)
561
  logger.debug(f"INFINITY delete table {table_name}, filter {filter}.")
562
  res = table_instance.delete(filter)
563
  self.connPool.release_conn(inf_conn)
 
595
  v = res[fieldnm][i]
596
  if isinstance(v, Series):
597
  v = list(v)
598
+ elif fieldnm in ["important_kwd", "question_kwd", "entities_kwd", "tag_kwd", "source_id"]:
599
  assert isinstance(v, str)
600
  v = [kwd for kwd in v.split("###") if kwd]
601
  elif fieldnm == "position_int":
 
626
  ans = {}
627
  num_rows = len(res)
628
  column_id = res["id"]
629
+ if fieldnm not in res:
630
+ return {}
631
  for i in range(num_rows):
632
  id = column_id[i]
633
  txt = res[fieldnm][i]
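`equivalent_condition_to_str` now takes the table instance so it can translate `exists` and `must_not: {exists: ...}` into inequality checks against each column's default value. A standalone sketch of the translation (a `(type, default)` column map replaces the `show_columns()` call; string columns are detected by a `cha` substring as in the commit, written here with `in` to avoid `str.find`'s -1-is-truthy pitfall):

```python
def condition_to_sql(condition, columns):
    # columns: {name: (type_string, default_value)}.
    def exists(col):
        ty, de = columns[col]
        if "cha" in ty.lower():          # varchar-like column
            return f"{col}!='{de or ''}'"
        return f"{col}!={de}"

    cond = []
    for k, v in condition.items():
        if isinstance(v, list):
            vals = ", ".join(f"'{x}'" if isinstance(x, str) else str(x) for x in v)
            cond.append(f"{k} IN ({vals})")
        elif k == "must_not" and isinstance(v, dict):
            for kk, vv in v.items():
                if kk == "exists":
                    cond.append(f"NOT ({exists(vv)})")
        elif k == "exists":
            cond.append(exists(v))
        elif isinstance(v, str):
            cond.append(f"{k}='{v}'")
        else:
            cond.append(f"{k}={v}")
    return " AND ".join(cond) if cond else "1=1"
```

The `1=1` fallback keeps the generated filter valid SQL when no conditions survive.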
uv.lock CHANGED
@@ -850,7 +850,7 @@ name = "coloredlogs"
850
  version = "15.0.1"
851
  source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
852
  dependencies = [
853
- { name = "humanfriendly", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
854
  ]
855
  sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/cc/c7/eed8f27100517e8c0e6b923d5f0845d0cb99763da6fdee00478f91db7325/coloredlogs-15.0.1.tar.gz", hash = "sha256:7c991aa71a4577af2f82600d8f8f3a89f936baeaf9b50a9c197da014e5bf16b0", size = 278520 }
856
  wheels = [
@@ -1329,18 +1329,18 @@ name = "fastembed"
1329
  version = "0.3.6"
1330
  source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
1331
  dependencies = [
1332
- { name = "huggingface-hub", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1333
- { name = "loguru", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1334
- { name = "mmh3", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1335
- { name = "numpy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1336
- { name = "onnx", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1337
- { name = "onnxruntime", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1338
- { name = "pillow", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1339
- { name = "pystemmer", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1340
- { name = "requests", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1341
- { name = "snowballstemmer", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1342
- { name = "tokenizers", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1343
- { name = "tqdm", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1344
  ]
1345
  sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/ae/20/68a109c8def842ed47a2951873fb2d7d23ee296ef8c195aedbb735670fff/fastembed-0.3.6.tar.gz", hash = "sha256:c93c8ec99b8c008c2d192d6297866b8d70ec7ac8f5696b34eb5ea91f85efd15f", size = 35058 }
1346
  wheels = [
@@ -1352,17 +1352,17 @@ name = "fastembed-gpu"
1352
  version = "0.3.6"
1353
  source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
1354
  dependencies = [
1355
- { name = "huggingface-hub", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1356
- { name = "loguru", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1357
- { name = "mmh3", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1358
- { name = "numpy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1359
- { name = "onnxruntime-gpu", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1360
- { name = "pillow", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1361
- { name = "pystemmer", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1362
- { name = "requests", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1363
- { name = "snowballstemmer", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1364
- { name = "tokenizers", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1365
- { name = "tqdm", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
1366
  ]
1367
  sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/da/07/7336c7f3d7ee47f33b407eeb50f5eeb152889de538a52a8f1cc637192816/fastembed_gpu-0.3.6.tar.gz", hash = "sha256:ee2de8918b142adbbf48caaffec0c492f864d73c073eea5a3dcd0e8c1041c50d", size = 35051 }
1368
  wheels = [
@@ -3424,8 +3424,8 @@ name = "onnx"
3424
  version = "1.17.0"
3425
  source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
3426
  dependencies = [
3427
- { name = "numpy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3428
- { name = "protobuf", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3429
  ]
3430
  sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/9a/54/0e385c26bf230d223810a9c7d06628d954008a5e5e4b73ee26ef02327282/onnx-1.17.0.tar.gz", hash = "sha256:48ca1a91ff73c1d5e3ea2eef20ae5d0e709bb8a2355ed798ffc2169753013fd3", size = 12165120 }
3431
  wheels = [
@@ -3451,12 +3451,12 @@ name = "onnxruntime"
3451
  version = "1.19.2"
3452
  source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
3453
  dependencies = [
3454
- { name = "coloredlogs", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3455
- { name = "flatbuffers", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3456
- { name = "numpy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3457
- { name = "packaging", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3458
- { name = "protobuf", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3459
- { name = "sympy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3460
  ]
3461
  wheels = [
3462
  { url = "https://pypi.tuna.tsinghua.edu.cn/packages/39/18/272d3d7406909141d3c9943796e3e97cafa53f4342d9231c0cfd8cb05702/onnxruntime-1.19.2-cp310-cp310-macosx_11_0_universal2.whl", hash = "sha256:84fa57369c06cadd3c2a538ae2a26d76d583e7c34bdecd5769d71ca5c0fc750e", size = 16776408 },
@@ -3481,12 +3481,12 @@ name = "onnxruntime-gpu"
3481
  version = "1.19.2"
3482
  source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
3483
  dependencies = [
3484
- { name = "coloredlogs", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3485
- { name = "flatbuffers", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3486
- { name = "numpy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3487
- { name = "packaging", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3488
- { name = "protobuf", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3489
- { name = "sympy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
3490
  ]
3491
  wheels = [
3492
  { url = "https://pypi.tuna.tsinghua.edu.cn/packages/d0/9c/3fa310e0730643051eb88e884f19813a6c8b67d0fbafcda610d960e589db/onnxruntime_gpu-1.19.2-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a49740e079e7c5215830d30cde3df792e903df007aa0b0fd7aa797937061b27a", size = 226178508 },
@@ -4768,8 +4768,8 @@ dependencies = [
4768
  { name = "nltk" },
4769
  { name = "numpy" },
4770
  { name = "ollama" },
4771
- { name = "onnxruntime", marker = "platform_machine == 'arm64' or sys_platform == 'darwin'" },
4772
- { name = "onnxruntime-gpu", marker = "platform_machine == 'x86_64'" },
4773
  { name = "openai" },
4774
  { name = "opencv-python" },
4775
  { name = "opencv-python-headless" },
@@ -4833,8 +4833,8 @@ dependencies = [
4833
  [package.optional-dependencies]
4834
  full = [
4835
  { name = "bcembedding" },
4836
- { name = "fastembed", marker = "platform_machine == 'arm64' or sys_platform == 'darwin'" },
4837
- { name = "fastembed-gpu", marker = "platform_machine == 'x86_64'" },
4838
  { name = "flagembedding" },
4839
  { name = "torch" },
4840
  { name = "transformers" },
@@ -4870,8 +4870,8 @@ requires-dist = [
4870
  { name = "elastic-transport", specifier = "==8.12.0" },
4871
  version = "15.0.1"
  source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
  dependencies = [
+ { name = "humanfriendly" },
  ]
  sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/cc/c7/eed8f27100517e8c0e6b923d5f0845d0cb99763da6fdee00478f91db7325/coloredlogs-15.0.1.tar.gz", hash = "sha256:7c991aa71a4577af2f82600d8f8f3a89f936baeaf9b50a9c197da014e5bf16b0", size = 278520 }
  wheels = [

  version = "0.3.6"
  source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
  dependencies = [
+ { name = "huggingface-hub" },
+ { name = "loguru" },
+ { name = "mmh3" },
+ { name = "numpy" },
+ { name = "onnx" },
+ { name = "onnxruntime" },
+ { name = "pillow" },
+ { name = "pystemmer" },
+ { name = "requests" },
+ { name = "snowballstemmer" },
+ { name = "tokenizers" },
+ { name = "tqdm" },
  ]
  sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/ae/20/68a109c8def842ed47a2951873fb2d7d23ee296ef8c195aedbb735670fff/fastembed-0.3.6.tar.gz", hash = "sha256:c93c8ec99b8c008c2d192d6297866b8d70ec7ac8f5696b34eb5ea91f85efd15f", size = 35058 }
  wheels = [

  version = "0.3.6"
  source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
  dependencies = [
+ { name = "huggingface-hub", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "loguru", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "mmh3", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "numpy", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "onnxruntime-gpu", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "pillow", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "pystemmer", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "requests", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "snowballstemmer", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "tokenizers", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "tqdm", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
  ]
  sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/da/07/7336c7f3d7ee47f33b407eeb50f5eeb152889de538a52a8f1cc637192816/fastembed_gpu-0.3.6.tar.gz", hash = "sha256:ee2de8918b142adbbf48caaffec0c492f864d73c073eea5a3dcd0e8c1041c50d", size = 35051 }
  wheels = [

  version = "1.17.0"
  source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
  dependencies = [
+ { name = "numpy" },
+ { name = "protobuf" },
  ]
  sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/9a/54/0e385c26bf230d223810a9c7d06628d954008a5e5e4b73ee26ef02327282/onnx-1.17.0.tar.gz", hash = "sha256:48ca1a91ff73c1d5e3ea2eef20ae5d0e709bb8a2355ed798ffc2169753013fd3", size = 12165120 }
  wheels = [

  version = "1.19.2"
  source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
  dependencies = [
+ { name = "coloredlogs" },
+ { name = "flatbuffers" },
+ { name = "numpy" },
+ { name = "packaging" },
+ { name = "protobuf" },
+ { name = "sympy" },
  ]
  wheels = [
  { url = "https://pypi.tuna.tsinghua.edu.cn/packages/39/18/272d3d7406909141d3c9943796e3e97cafa53f4342d9231c0cfd8cb05702/onnxruntime-1.19.2-cp310-cp310-macosx_11_0_universal2.whl", hash = "sha256:84fa57369c06cadd3c2a538ae2a26d76d583e7c34bdecd5769d71ca5c0fc750e", size = 16776408 },

  version = "1.19.2"
  source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
  dependencies = [
+ { name = "coloredlogs", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "flatbuffers", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "numpy", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "packaging", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "protobuf", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+ { name = "sympy", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
  ]
  wheels = [
  { url = "https://pypi.tuna.tsinghua.edu.cn/packages/d0/9c/3fa310e0730643051eb88e884f19813a6c8b67d0fbafcda610d960e589db/onnxruntime_gpu-1.19.2-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a49740e079e7c5215830d30cde3df792e903df007aa0b0fd7aa797937061b27a", size = 226178508 },

  { name = "nltk" },
  { name = "numpy" },
  { name = "ollama" },
+ { name = "onnxruntime", marker = "platform_machine != 'x86_64' or sys_platform == 'darwin'" },
+ { name = "onnxruntime-gpu", marker = "platform_machine == 'x86_64' and sys_platform != 'darwin'" },
  { name = "openai" },
  { name = "opencv-python" },
  { name = "opencv-python-headless" },

  [package.optional-dependencies]
  full = [
  { name = "bcembedding" },
+ { name = "fastembed", marker = "platform_machine != 'x86_64' or sys_platform == 'darwin'" },
+ { name = "fastembed-gpu", marker = "platform_machine == 'x86_64' and sys_platform != 'darwin'" },
  { name = "flagembedding" },
  { name = "torch" },
  { name = "transformers" },

  { name = "elastic-transport", specifier = "==8.12.0" },
  { name = "elasticsearch", specifier = "==8.12.1" },
  { name = "elasticsearch-dsl", specifier = "==8.12.0" },
- { name = "fastembed", marker = "(platform_machine == 'arm64' and extra == 'full') or (sys_platform == 'darwin' and extra == 'full')", specifier = ">=0.3.6,<0.4.0" },
- { name = "fastembed-gpu", marker = "platform_machine == 'x86_64' and extra == 'full'", specifier = ">=0.3.6,<0.4.0" },
+ { name = "fastembed", marker = "(platform_machine != 'x86_64' and extra == 'full') or (sys_platform == 'darwin' and extra == 'full')", specifier = ">=0.3.6,<0.4.0" },
+ { name = "fastembed-gpu", marker = "platform_machine == 'x86_64' and sys_platform != 'darwin' and extra == 'full'", specifier = ">=0.3.6,<0.4.0" },
  { name = "fasttext", specifier = "==0.9.3" },
  { name = "filelock", specifier = "==3.15.4" },
  { name = "flagembedding", marker = "extra == 'full'", specifier = "==1.2.10" },
@@ -4900,8 +4900,8 @@ requires-dist = [
  { name = "nltk", specifier = "==3.9.1" },
  { name = "numpy", specifier = ">=1.26.0,<2.0.0" },
  { name = "ollama", specifier = "==0.2.1" },
- { name = "onnxruntime", marker = "platform_machine == 'arm64' or sys_platform == 'darwin'", specifier = "==1.19.2" },
- { name = "onnxruntime-gpu", marker = "platform_machine == 'x86_64'", specifier = "==1.19.2" },
+ { name = "onnxruntime", marker = "platform_machine != 'x86_64' or sys_platform == 'darwin'", specifier = "==1.19.2" },
+ { name = "onnxruntime-gpu", marker = "platform_machine == 'x86_64' and sys_platform != 'darwin'", specifier = "==1.19.2" },
  { name = "openai", specifier = "==1.45.0" },
  { name = "opencv-python", specifier = "==4.10.0.84" },
  { name = "opencv-python-headless", specifier = "==4.10.0.84" },
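The marker changes above all encode one rule: the GPU builds (`onnxruntime-gpu`, `fastembed-gpu`) are selected only on x86_64 machines that are not running macOS, and the CPU builds are used everywhere else — including Intel Macs, which the old `platform_machine == 'x86_64'` marker wrongly pointed at the GPU wheel. A minimal sketch of that selection logic (the helper name is hypothetical, not part of the repo):

```python
def select_onnxruntime(platform_machine: str, sys_platform: str) -> str:
    """Mirror the uv.lock markers: GPU wheel only on non-macOS x86_64."""
    # marker: platform_machine == 'x86_64' and sys_platform != 'darwin'
    if platform_machine == "x86_64" and sys_platform != "darwin":
        return "onnxruntime-gpu"
    # marker: platform_machine != 'x86_64' or sys_platform == 'darwin'
    return "onnxruntime"


print(select_onnxruntime("x86_64", "linux"))   # x86_64 Linux -> GPU wheel
print(select_onnxruntime("x86_64", "darwin"))  # Intel macOS -> CPU wheel
print(select_onnxruntime("arm64", "darwin"))   # Apple Silicon -> CPU wheel
```

Because the two markers are exact negations of each other, exactly one of the two packages is resolved on every platform, which is what keeps the lock consistent.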
web/src/pages/user-setting/components/setting-title/index.tsx CHANGED
@@ -27,7 +27,10 @@ const SettingTitle = ({
       </div>
       {showRightButton && (
         <Button type={'primary'} onClick={clickButton}>
-          <SettingOutlined></SettingOutlined> {t('systemModelSettings')}
+          <Flex align="center" gap={4}>
+            <SettingOutlined />
+            {t('systemModelSettings')}
+          </Flex>
         </Button>
       )}
     </Flex>
web/src/pages/user-setting/setting-model/index.tsx CHANGED
@@ -92,27 +92,31 @@ const ModelCard = ({ item, clickApiKey }: IModelCardProps) => {
     <Col span={12} className={styles.factoryOperationWrapper}>
       <Space size={'middle'}>
         <Button onClick={handleApiKeyClick}>
-          {isLocalLlmFactory(item.name) ||
-          item.name === 'VolcEngine' ||
-          item.name === 'Tencent Hunyuan' ||
-          item.name === 'XunFei Spark' ||
-          item.name === 'BaiduYiyan' ||
-          item.name === 'Fish Audio' ||
-          item.name === 'Tencent Cloud' ||
-          item.name === 'Google Cloud' ||
-          item.name === 'Azure OpenAI'
-            ? t('addTheModel')
-            : 'API-Key'}
-          <SettingOutlined />
+          <Flex align="center" gap={4}>
+            {isLocalLlmFactory(item.name) ||
+            item.name === 'VolcEngine' ||
+            item.name === 'Tencent Hunyuan' ||
+            item.name === 'XunFei Spark' ||
+            item.name === 'BaiduYiyan' ||
+            item.name === 'Fish Audio' ||
+            item.name === 'Tencent Cloud' ||
+            item.name === 'Google Cloud' ||
+            item.name === 'Azure OpenAI'
+              ? t('addTheModel')
+              : 'API-Key'}
+            <SettingOutlined />
+          </Flex>
         </Button>
         <Button onClick={handleShowMoreClick}>
-          <Flex gap={'small'}>
+          <Flex align="center" gap={4}>
            {t('showMoreModels')}
            <MoreModelIcon />
          </Flex>
         </Button>
         <Button type={'text'} onClick={handleDeleteFactory}>
-          <CloseCircleOutlined style={{ color: '#D92D20' }} />
+          <Flex align="center">
+            <CloseCircleOutlined style={{ color: '#D92D20' }} />
+          </Flex>
         </Button>
       </Space>
     </Col>