Improved ollama doc (#3787)

### What problem does this PR solve?

Improved ollama doc. Close #3723
### Type of change
- [x] Documentation Update

docs/guides/deploy_local_llm.mdx CHANGED (+40 -37)

This user guide does not intend to cover much of the installation or configuration details of Ollama or Xinference; its focus is on configurations inside RAGFlow. For the most current information, you may need to check out the official site of Ollama or Xinference.
:::

## Deploy local models using Ollama

[Ollama](https://github.com/ollama/ollama) enables you to run open-source large language models that you deployed locally. It bundles model weights, configurations, and data into a single package, defined by a Modelfile, and optimizes setup and configurations, including GPU usage.

- For a complete list of supported models and variants, see the [Ollama model library](https://ollama.com/library).
:::

### 1. Deploy Ollama using Docker

```bash
sudo docker run --name ollama -p 11434:11434 ollama/ollama
time=2024-12-02T02:20:21.360Z level=INFO source=routes.go:1248 msg="Listening on [::]:11434 (version 0.4.6)"
time=2024-12-02T02:20:21.360Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11 cuda_v12]"
```
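
The command above runs Ollama in the foreground. Depending on your setup you may prefer a variation; the flags below follow the ollama/ollama Docker image conventions (a named volume for the model store, optional NVIDIA GPU passthrough), so treat this as a sketch and adapt it as needed:

```bash
# Run detached, keep pulled models in a named volume so they survive container restarts,
# and (optionally) pass through NVIDIA GPUs if the NVIDIA Container Toolkit is installed.
sudo docker run -d --name ollama \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --gpus=all \
  ollama/ollama
```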

Ensure Ollama is listening on all IP addresses:
```bash
sudo ss -tunlp | grep 11434
tcp   LISTEN 0   4096   0.0.0.0:11434   0.0.0.0:*   users:(("docker-proxy",pid=794507,fd=4))
tcp   LISTEN 0   4096      [::]:11434      [::]:*   users:(("docker-proxy",pid=794513,fd=4))
```

Pull models as you need. It's recommended to start with `llama3.2` (a 3B chat model) and `bge-m3` (a 567M embedding model):
```bash
sudo docker exec ollama ollama pull llama3.2
pulling dde5aa3fc5ff... 100% ██████████████████ 2.0 GB
success
```

```bash
sudo docker exec ollama ollama pull bge-m3
pulling daec91ffb5dd... 100% ██████████████████ 1.2 GB
success
```
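
To double-check which models are now available locally, list them inside the container (sizes and digests will vary on your machine):

```bash
sudo docker exec ollama ollama list
```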

### 2. Ensure Ollama is accessible

If RAGFlow runs in Docker and Ollama runs on the same host machine, check whether Ollama is accessible from inside the RAGFlow container:
```bash
sudo docker exec -it ragflow-server bash
root@8136b8c3e914:/ragflow# curl http://host.docker.internal:11434/
Ollama is running
```

If RAGFlow runs from source code and Ollama runs on the same host machine, check whether Ollama is accessible from the RAGFlow host machine:
```bash
curl http://localhost:11434/
Ollama is running
```

If RAGFlow and Ollama run on different machines, check whether Ollama is accessible from the RAGFlow host machine:
```bash
curl http://${IP_OF_OLLAMA_MACHINE}:11434/
Ollama is running
```
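
If the remote check fails and Ollama on the other machine was installed natively (not through Docker), it may be listening on 127.0.0.1 only. In that case, set the `OLLAMA_HOST` environment variable to `0.0.0.0` in **ollama.service** (note that this is *not* the base URL) and restart Ollama; see the [Ollama FAQ](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-configure-ollama-server) for details. A sketch for systemd-based installs:

```bash
# In the [Service] section of ollama.service:
Environment="OLLAMA_HOST=0.0.0.0"
# Then reload systemd and restart Ollama:
sudo systemctl daemon-reload
sudo systemctl restart ollama
```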

### 4. Add Ollama

In the popup window, complete basic settings for Ollama:

1. Ensure that the model name and type match the models pulled in step 1, for example (`llama3.2`, `chat`) or (`bge-m3`, `embedding`).
2. Ensure that the base URL matches the one determined in step 2 (a quick way to verify it is sketched after this list).
3. OPTIONAL: Switch on the toggle under **Does it support Vision?** if your model includes an image-to-text model.
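
If you want to double-check the base URL before saving, you can call Ollama's chat endpoint from the same place you ran the step 2 check, substituting your own base URL (the sketch below assumes the in-Docker case and the `llama3.2` model pulled in step 1):

```bash
curl http://host.docker.internal:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [{"role": "user", "content": "Hello"}],
  "stream": false
}'
```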

:::caution WARNING
Improper base URL settings will trigger the following error:
Max retries exceeded with url: /api/chat (Caused by NewConnectionError('<urllib3...
:::

Click on your logo **>** **Model Providers** **>** **System Model Settings** to update your model:

*You should now be able to find **llama3.2** in the dropdown list under **Chat model**, and **bge-m3** in the dropdown list under **Embedding model**.*

> If your local model is an embedding model, you should find your local model under **Embedding model**.
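
Similarly, to confirm the embedding model outside RAGFlow, you can query Ollama's embeddings endpoint from the same place (again substituting your own base URL; `bge-m3` is the embedding model pulled in step 1):

```bash
curl http://host.docker.internal:11434/api/embeddings -d '{
  "model": "bge-m3",
  "prompt": "What is RAGFlow?"
}'
```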