# [ChatDocs](https://github.com/marella/chatdocs)

[PyPI](https://pypi.org/project/chatdocs/) · [Tests](https://github.com/marella/chatdocs/actions/workflows/tests.yml)

Chat with your documents offline using AI. No data leaves your system. An internet connection is required only to install the tool and download the AI models. It is based on [PrivateGPT](https://github.com/imartinez/privateGPT) but has more features.
 | |
- [Features](#features) | |
- [Installation](#installation) | |
- [Usage](#usage) | |
- [Configuration](#configuration) | |
- [GPU](#gpu) | |
## Features

- Supports GGML models via [C Transformers](https://github.com/marella/ctransformers)
- Supports 🤗 Transformers models
- Supports GPTQ models
- Web UI
- GPU support
- Highly configurable via `chatdocs.yml`
<details>
<summary><strong>Show supported document types</strong></summary><br>

| Extension       | Format                         |
| :-------------- | :----------------------------- |
| `.csv`          | CSV                            |
| `.docx`, `.doc` | Word Document                  |
| `.enex`         | EverNote                       |
| `.eml`          | Email                          |
| `.epub`         | EPub                           |
| `.html`         | HTML                           |
| `.md`           | Markdown                       |
| `.msg`          | Outlook Message                |
| `.odt`          | Open Document Text             |
| `.pdf`          | Portable Document Format (PDF) |
| `.pptx`, `.ppt` | PowerPoint Document            |
| `.txt`          | Text file (UTF-8)              |

</details>
## Installation

Install the tool using:

```sh
pip install chatdocs
```

Download the AI models using:

```sh
chatdocs download
```

The tool can now be used offline, without an internet connection.
## Usage

Add a directory containing the documents to chat with using:

```sh
chatdocs add /path/to/documents
```

> The processed documents will be stored in the `db` directory by default.

Chat with your documents using:

```sh
chatdocs ui
```

Open http://localhost:5000 in your browser to access the web UI.
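If port 5000 is already taken, the address the web UI binds to can be changed in `chatdocs.yml` (see [Configuration](#configuration)). A minimal sketch, assuming the default config exposes top-level `host` and `port` keys — verify against the default `chatdocs.yml` of your installed version:

```yml
# Assumed keys: top-level `host` and `port`, as in the default config file.
host: localhost
port: 8080
```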
It also has a nice command-line interface:

```sh
chatdocs chat
```

<details>
<summary><strong>Show preview</strong></summary><br>

![Demo]()
</details>
## Configuration

All the configuration options can be changed using the `chatdocs.yml` config file. Create a `chatdocs.yml` file in some directory and run all commands from that directory. For reference, see the default [`chatdocs.yml`](https://github.com/marella/chatdocs/blob/main/chatdocs/data/chatdocs.yml) file.

You don't have to copy the entire file; add only the config options you want to change, as your file will be merged with the default config. For example, see [`tests/fixtures/chatdocs.yml`](https://github.com/marella/chatdocs/blob/main/tests/fixtures/chatdocs.yml), which changes only some of the config options.
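For instance, a minimal `chatdocs.yml` that overrides just the embeddings model and the GGML model (using the options covered in the sections below) could look like this; everything else falls back to the defaults:

```yml
embeddings:
  model: hkunlp/instructor-large

ctransformers:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GGML
  model_file: Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama
```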
### Embeddings

To change the embeddings model, add and change the following in your `chatdocs.yml`:

```yml
embeddings:
  model: hkunlp/instructor-large
```

> **Note:** When you change the embeddings model, delete the `db` directory and add the documents again.
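Rebuilding the index after an embeddings change can be sketched as follows, assuming the default `db` location and your original documents path:

```sh
rm -rf db                         # remove the index built with the previous embeddings model
chatdocs add /path/to/documents   # re-process the documents with the new embeddings model
```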
### C Transformers

To change the C Transformers GGML model, add and change the following in your `chatdocs.yml`:

```yml
ctransformers:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GGML
  model_file: Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin
  model_type: llama
```

> **Note:** When you add a new model for the first time, run `chatdocs download` to download the model before using it.

You can also use an existing local model file:

```yml
ctransformers:
  model: /path/to/ggml-model.bin
  model_type: llama
```
### 🤗 Transformers

To use 🤗 Transformers models, add the following to your `chatdocs.yml`:

```yml
llm: huggingface
```

To change the 🤗 Transformers model, add and change the following in your `chatdocs.yml`:

```yml
huggingface:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-HF
```

> **Note:** When you add a new model for the first time, run `chatdocs download` to download the model before using it.
### GPTQ

To use GPTQ models, install the `auto-gptq` package using:

```sh
pip install "chatdocs[gptq]"
```

and add the following to your `chatdocs.yml`:

```yml
llm: gptq
```

To change the GPTQ model, add and change the following in your `chatdocs.yml`:

```yml
gptq:
  model: TheBloke/Wizard-Vicuna-7B-Uncensored-GPTQ
  model_file: Wizard-Vicuna-7B-Uncensored-GPTQ-4bit-128g.no-act-order.safetensors
```

> **Note:** When you add a new model for the first time, run `chatdocs download` to download the model before using it.
## GPU

### Embeddings

To enable GPU (CUDA) support for the embeddings model, add the following to your `chatdocs.yml`:

```yml
embeddings:
  model_kwargs:
    device: cuda
```

You may have to reinstall PyTorch with CUDA enabled by following the instructions [here](https://pytorch.org/get-started/locally/).
### C Transformers

> **Note:** Currently only LLaMA GGML models have GPU support.

To enable GPU (CUDA) support for the C Transformers GGML model, add the following to your `chatdocs.yml`:

```yml
ctransformers:
  config:
    gpu_layers: 50
```

You should also reinstall the `ctransformers` package with CUDA enabled:

```sh
pip uninstall ctransformers --yes
CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers
```
<details>
<summary><strong>Show commands for Windows</strong></summary><br>

On Windows PowerShell run:

```sh
$env:CT_CUBLAS=1
pip uninstall ctransformers --yes
pip install ctransformers --no-binary ctransformers
```

On Windows Command Prompt run:

```sh
set CT_CUBLAS=1
pip uninstall ctransformers --yes
pip install ctransformers --no-binary ctransformers
```

</details>
### 🤗 Transformers

To enable GPU (CUDA) support for the 🤗 Transformers model, add the following to your `chatdocs.yml`:

```yml
huggingface:
  device: 0
```

You may have to reinstall PyTorch with CUDA enabled by following the instructions [here](https://pytorch.org/get-started/locally/).
### GPTQ

To enable GPU (CUDA) support for the GPTQ model, add the following to your `chatdocs.yml`:

```yml
gptq:
  device: 0
```

You may have to reinstall PyTorch with CUDA enabled by following the instructions [here](https://pytorch.org/get-started/locally/). After installing PyTorch with CUDA enabled, you should also reinstall the `auto-gptq` package:

```sh
pip uninstall auto-gptq --yes
pip install "chatdocs[gptq]"
```
## License

[MIT](https://github.com/marella/chatdocs/blob/main/LICENSE)