# [ChatDocs](https://github.com/marella/chatdocs) [](https://pypi.org/project/chatdocs/) [](https://github.com/marella/chatdocs/actions/workflows/tests.yml)
Chat with your documents offline using AI. No data leaves your system. An internet connection is required only to install the tool and download the AI models. It is based on [PrivateGPT](https://github.com/imartinez/privateGPT) but has more features.

- [Features](#features)
- [Installation](#installation)
- [Usage](#usage)
- [Configuration](#configuration)
- [GPU](#gpu)
## Features
- Supports GGML models via [C Transformers](https://github.com/marella/ctransformers)
- Supports 🤗 Transformers models
- Supports GPTQ models
- Web UI
- GPU support
- Highly configurable via `chatdocs.yml`
<details>
<summary><strong>Show supported document types</strong></summary><br>

| Extension       | Format                         |
| :-------------- | :----------------------------- |
| `.csv`          | CSV                            |
| `.docx`, `.doc` | Word Document                  |
| `.enex`         | EverNote                       |
| `.eml`          | Email                          |
| `.epub`         | EPub                           |
| `.html`         | HTML                           |
| `.md`           | Markdown                       |
| `.msg`          | Outlook Message                |
| `.odt`          | Open Document Text             |
| `.pdf`          | Portable Document Format (PDF) |
| `.pptx`, `.ppt` | PowerPoint Document            |
| `.txt`          | Text file (UTF-8)              |

</details>
## Installation
Install the tool using:
```sh
pip install chatdocs
```
Download the AI models using:
```sh
chatdocs download
```
The tool can now run offline, without an internet connection.
## Usage
Add a directory containing the documents you want to chat with:
```sh
chatdocs add /path/to/documents
```
> The processed documents will be stored in the `db` directory by default.

Chat with your documents using:
```sh
chatdocs ui
```
Open http://localhost:5000 in your browser to access the web UI.
It also has a nice command-line interface:
```sh
chatdocs chat
```
<details>
<summary><strong>Show preview</strong></summary><br>

![Demo](https://github.com/marella/chatdocs/raw/main/docs/demo.png)

</details>
## Configuration
All the configuration options can be changed using the `chatdocs.yml` config file. Create a `chatdocs.yml` file in some directory and run all commands from that directory. For reference, see the default [`chatdocs.yml`](https://github.com/marella/chatdocs/blob/main/chatdocs/data/chatdocs.yml) file.
You don't have to copy the entire file; just add the config options you want to change, and they will be merged with the default config. For example, see [`tests/fixtures/chatdocs.yml`](https://github.com/marella/chatdocs/blob/main/tests/fixtures/chatdocs.yml), which overrides only a few of the config options.
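As a sketch, a minimal `chatdocs.yml` that overrides only two of the options documented in this README (the model path is a placeholder) could look like:

```yml
# Minimal override file: any option not listed here
# keeps its value from the default chatdocs.yml.
ctransformers:
  model: /path/to/ggml-model.bin
  model_type: llama
embeddings:
  model: hkunlp/instructor-large
```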
### Embeddings
To change the embeddings model, add and change the following in your `chatdocs.yml`:
```yml
embeddings:
  model: hkunlp/instructor-large
```
> **Note:** When you change the embeddings model, delete the `db` directory and add documents again.
### C Transformers
To change the C Transformers GGML model, add and change the following in your `chatdocs.yml`:
```yml
ctransformers:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GGML
  model_file: Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama
```
> **Note:** When you add a new model for the first time, run `chatdocs download` to download the model before using it.

You can also use an existing local model file:
```yml
ctransformers:
  model: /path/to/ggml-model.bin
  model_type: llama
```
### 🤗 Transformers
To use 🤗 Transformers models, add the following to your `chatdocs.yml`:
```yml
llm: huggingface
```
To change the 🤗 Transformers model, add and change the following in your `chatdocs.yml`:
```yml
huggingface:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-HF
```
> **Note:** When you add a new model for the first time, run `chatdocs download` to download the model before using it.
### GPTQ
To use GPTQ models, install the `auto-gptq` package using:
```sh
pip install "chatdocs[gptq]"
```
and add the following to your `chatdocs.yml`:
```yml
llm: gptq
```
To change the GPTQ model, add and change the following in your `chatdocs.yml`:
```yml
gptq:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ
  model_file: Wizard-Vicuna-7B-Uncensored-GPTQ-4bit-128g.no-act-order.safetensors
```
> **Note:** When you add a new model for the first time, run `chatdocs download` to download the model before using it.
## GPU
### Embeddings
To enable GPU (CUDA) support for the embeddings model, add the following to your `chatdocs.yml`:
```yml
embeddings:
  model_kwargs:
    device: cuda
```
You may have to reinstall PyTorch with CUDA enabled by following the instructions [here](https://pytorch.org/get-started/locally/).
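As a quick sanity check (assuming `torch` is importable in the same environment), you can ask PyTorch whether its installed build can see a CUDA device:

```sh
# Prints True only when PyTorch was built with CUDA and a GPU is visible.
python -c "import torch; print(torch.cuda.is_available())"
```

If this prints `False`, reinstall PyTorch with CUDA enabled as described above.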
### C Transformers
> **Note:** Currently only LLaMA GGML models have GPU support.

To enable GPU (CUDA) support for the C Transformers GGML model, add the following to your `chatdocs.yml`:
```yml
ctransformers:
  config:
    gpu_layers: 50
```
You should also reinstall the `ctransformers` package with CUDA enabled:
```sh
pip uninstall ctransformers --yes
CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers
```
<details>
<summary><strong>Show commands for Windows</strong></summary><br>

On Windows PowerShell run:

```sh
$env:CT_CUBLAS=1
pip uninstall ctransformers --yes
pip install ctransformers --no-binary ctransformers
```

On Windows Command Prompt run:

```sh
set CT_CUBLAS=1
pip uninstall ctransformers --yes
pip install ctransformers --no-binary ctransformers
```

</details>
### 🤗 Transformers
To enable GPU (CUDA) support for the 🤗 Transformers model, add the following to your `chatdocs.yml`:
```yml
huggingface:
  device: 0
```
You may have to reinstall PyTorch with CUDA enabled by following the instructions [here](https://pytorch.org/get-started/locally/).
### GPTQ
To enable GPU (CUDA) support for the GPTQ model, add the following to your `chatdocs.yml`:
```yml
gptq:
  device: 0
```
You may have to reinstall PyTorch with CUDA enabled by following the instructions [here](https://pytorch.org/get-started/locally/).
After installing PyTorch with CUDA enabled, you should also reinstall the `auto-gptq` package:
```sh
pip uninstall auto-gptq --yes
pip install "chatdocs[gptq]"
```
## License
[MIT](https://github.com/marella/chatdocs/blob/main/LICENSE)