zhichyu committed on
Commit
e3fc972
·
1 Parent(s): e2bab34

Improved ollama doc (#3787)


### What problem does this PR solve?

Improved ollama doc. Close #3723

### Type of change

- [x] Documentation Update

Files changed (1)
  1. docs/guides/deploy_local_llm.mdx +40 -37
docs/guides/deploy_local_llm.mdx CHANGED
@@ -17,7 +17,7 @@ RAGFlow seamlessly integrates with Ollama and Xinference, without the need for f
17
  This user guide does not intend to cover much of the installation or configuration details of Ollama or Xinference; its focus is on configurations inside RAGFlow. For the most current information, you may need to check out the official site of Ollama or Xinference.
18
  :::
19
 
20
- ## Deploy a local model using Ollama
21
 
22
  [Ollama](https://github.com/ollama/ollama) enables you to run open-source large language models that you deployed locally. It bundles model weights, configurations, and data into a single package, defined by a Modelfile, and optimizes setup and configurations, including GPU usage.
23
 
@@ -27,35 +27,54 @@ This user guide does not intend to cover much of the installation or configurati
27
  - For a complete list of supported models and variants, see the [Ollama model library](https://ollama.com/library).
28
  :::
29
 
30
- To deploy a local model, e.g., **Llama3**, using Ollama:
31
 
32
- ### 1. Check firewall settings
33
 
34
- Ensure that your host machine's firewall allows inbound connections on port 11434. For example:
35
-
36
  ```bash
37
- sudo ufw allow 11434/tcp
38
  ```
39
  ### 2. Ensure Ollama is accessible
40
 
41
- Restart system and use curl or your web browser to check if the service URL of your Ollama service at `http://localhost:11434` is accessible.
42
-
43
  ```bash
44
  Ollama is running
45
  ```
46
 
47
- ### 3. Run your local model
48
-
49
  ```bash
50
- ollama run llama3
51
  ```
52
- <details>
53
- <summary>If your Ollama is installed through Docker, run the following instead:</summary>
54
 
55
- ```bash
56
- docker exec -it ollama ollama run llama3
57
- ```
58
- </details>
59
 
60
  ### 4. Add Ollama
61
 
@@ -68,26 +87,10 @@ In RAGFlow, click on your logo on the top right of the page **>** **Model Provid
68
 
69
  In the popup window, complete basic settings for Ollama:
70
 
71
- 1. Because **llama3** is a chat model, choose **chat** as the model type.
72
- 2. Ensure that the model name you enter here *precisely* matches the name of the local model you are running with Ollama.
73
- 3. Ensure that the base URL you enter is accessible to RAGFlow.
74
- 4. OPTIONAL: Switch on the toggle under **Does it support Vision?** if your model includes an image-to-text model.
75
 
76
- :::caution NOTE
77
- - If RAGFlow is in Docker and Ollama runs on the same host machine, use `http://host.docker.internal:11434` as base URL.
78
- - If your Ollama and RAGFlow run on the same machine, use `http://localhost:11434` as base URL.
79
- - If your Ollama runs on a different machine from RAGFlow, use `http://<IP_OF_OLLAMA_MACHINE>:11434` as base URL.
80
- :::
81
-
82
- :::danger WARNING
83
- If your Ollama runs on a different machine, you may also need to set the `OLLAMA_HOST` environment variable to `0.0.0.0` in **ollama.service** (Note that this is *NOT* the base URL):
84
-
85
- ```bash
86
- Environment="OLLAMA_HOST=0.0.0.0"
87
- ```
88
-
89
- See [this guide](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-configure-ollama-server) for more information.
90
- :::
91
 
92
  :::caution WARNING
93
  Improper base URL settings will trigger the following error:
@@ -100,7 +103,7 @@ Max retries exceeded with url: /api/chat (Caused by NewConnectionError('<urllib3
100
 
101
  Click on your logo **>** **Model Providers** **>** **System Model Settings** to update your model:
102
 
103
- *You should now be able to find **llama3** from the dropdown list under **Chat model**.*
104
 
105
  > If your local model is an embedding model, you should find your local model under **Embedding model**.
106
 
 
17
  This user guide does not intend to cover much of the installation or configuration details of Ollama or Xinference; its focus is on configurations inside RAGFlow. For the most current information, you may need to check out the official site of Ollama or Xinference.
18
  :::
19
 
20
+ ## Deploy local models using Ollama
21
 
22
  [Ollama](https://github.com/ollama/ollama) enables you to run open-source large language models that you deployed locally. It bundles model weights, configurations, and data into a single package, defined by a Modelfile, and optimizes setup and configurations, including GPU usage.
23
 
 
27
  - For a complete list of supported models and variants, see the [Ollama model library](https://ollama.com/library).
28
  :::
29
 
30
+ ### 1. Deploy Ollama using Docker
31
 
32
+ ```bash
33
+ sudo docker run --name ollama -p 11434:11434 ollama/ollama
34
+ time=2024-12-02T02:20:21.360Z level=INFO source=routes.go:1248 msg="Listening on [::]:11434 (version 0.4.6)"
35
+ time=2024-12-02T02:20:21.360Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11 cuda_v12]"
36
+ ```
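The command above runs Ollama in the foreground, without GPU access, and keeps pulled models inside the container's filesystem. A commonly used variant, assuming the NVIDIA Container Toolkit is installed and you want models to survive container recreation, runs detached with GPU access and a named volume (adjust to your own setup):

```bash
# Variant of the command above: run detached, persist models in the "ollama"
# named volume, and expose the host's GPUs to the container.
sudo docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```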
37
 
38
+ Ensure Ollama is listening on all IP addresses:
39
  ```bash
40
+ sudo ss -tunlp|grep 11434
41
+ tcp LISTEN 0 4096 0.0.0.0:11434 0.0.0.0:* users:(("docker-proxy",pid=794507,fd=4))
42
+ tcp LISTEN 0 4096 [::]:11434 [::]:* users:(("docker-proxy",pid=794513,fd=4))
43
+ ```
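If the port shows up here but is still unreachable from other machines, the host firewall may be blocking inbound connections to it. On an Ubuntu host using ufw, for example:

```bash
# Allow inbound TCP connections to Ollama's default port 11434 (Ubuntu/ufw example).
sudo ufw allow 11434/tcp
```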
44
+
45
+ Pull the models you need. It's recommended to start with `llama3.2` (a 3B chat model) and `bge-m3` (a 567M embedding model):
46
+ ```bash
47
+ sudo docker exec ollama ollama pull llama3.2
48
+ pulling dde5aa3fc5ff... 100% ▕████████████████▏ 2.0 GB
49
+ success
50
  ```
51
+
52
+ ```bash
53
+ sudo docker exec ollama ollama pull bge-m3
54
+ pulling daec91ffb5dd... 100% ▕████████████████▏ 1.2 GB
55
+ success
56
+ ```
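To confirm that both models were pulled successfully, you can list the models known to the container; the exact sizes and digests in the output will differ from machine to machine:

```bash
# llama3.2 and bge-m3 should both appear in the listing.
sudo docker exec ollama ollama list
```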
57
+
58
  ### 2. Ensure Ollama is accessible
59
 
60
+ If RAGFlow runs in Docker and Ollama runs on the same host machine, check if Ollama is accessible from inside the RAGFlow container:
61
  ```bash
62
+ sudo docker exec -it ragflow-server bash
63
+ root@8136b8c3e914:/ragflow# curl http://host.docker.internal:11434/
64
  Ollama is running
65
  ```
66
 
67
+ If RAGFlow runs from source code and Ollama runs on the same host machine, check if Ollama is accessible from the RAGFlow host machine:
68
  ```bash
69
+ curl http://localhost:11434/
70
+ Ollama is running
71
  ```
72
 
73
+ If RAGFlow and Ollama run on different machines, check if Ollama is accessible from the RAGFlow host machine:
74
+ ```bash
75
+ curl http://${IP_OF_OLLAMA_MACHINE}:11434/
76
+ Ollama is running
77
+ ```
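Beyond the banner above, you can also confirm that the pulled models are visible over Ollama's HTTP API: the `/api/tags` endpoint returns the locally available models as JSON. Substitute whichever base URL applies to your deployment:

```bash
# Should return a JSON object whose "models" array includes llama3.2 and bge-m3.
curl http://localhost:11434/api/tags
```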
78
 
79
  ### 4. Add Ollama
80
 
 
87
 
88
  In the popup window, complete basic settings for Ollama:
89
 
90
+ 1. Ensure that the model name and type match those pulled in step 1, for example (`llama3.2`, `chat`) and (`bge-m3`, `embedding`).
91
+ 2. Ensure that the base URL matches the one determined in step 2; a quick way to double-check it is sketched after this list.
92
+ 3. OPTIONAL: Switch on the toggle under **Does it support Vision?** if your model includes an image-to-text model.
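If you are unsure which base URL to enter, a quick sanity check is to send the same kind of request RAGFlow sends, a POST to Ollama's `/api/chat` endpoint, from wherever RAGFlow runs (from inside the `ragflow-server` container if RAGFlow is in Docker). The URL below is a sketch assuming the Docker-on-the-same-host case; substitute your own base URL otherwise:

```bash
# A minimal chat request; any JSON reply indicates the base URL is usable from here.
curl http://host.docker.internal:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [{"role": "user", "content": "Hello"}],
  "stream": false
}'
```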
 
93
 
94
 
95
  :::caution WARNING
96
  Improper base URL settings will trigger the following error:
 
103
 
104
  Click on your logo **>** **Model Providers** **>** **System Model Settings** to update your model:
105
 
106
+ *You should now be able to find **llama3.2** from the dropdown list under **Chat model**, and **bge-m3** from the dropdown list under **Embedding model**.*
107
 
108
  > If your local model is an embedding model, you should find your local model under **Embedding model**.
109