---
license: cc-by-nc-4.0
language:
- pl
base_model:
- CYFRAGOVPL/PLLuM-8x7B-nc-instruct
---
# PLLuM-8x7B-nc-instruct GGUF Quantizations by Nondzu
DISCLAIMER: These are quantized builds of an existing model. I am not the author of the original model; I only host the quantized versions and take no responsibility for them.
This repository contains GGUF quantized versions of the [PLLuM-8x7B-nc-instruct](https://huggingface.co/CYFRAGOVPL/PLLuM-8x7B-nc-instruct) model. All quantizations were performed with [llama.cpp](https://github.com/ggerganov/llama.cpp) release [b4768](https://github.com/ggml-org/llama.cpp/releases/tag/b4768). These quantized models can be run in [LM Studio](https://lmstudio.ai/) or any other llama.cpp-based project.
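Once a file is downloaded, it can be smoke-tested directly from the command line. A minimal sketch, assuming a local llama.cpp build from the same release; the model path, prompt, and token count below are example values:
```bash
# Quick local test with llama.cpp's CLI (binary name as of release b4768).
# Adjust the model path to wherever you saved the GGUF file.
./llama-cli \
  -m ./PLLuM-8x7B-nc-instruct-Q4_K_M.gguf \
  -p "Napisz krótkie powitanie po polsku." \
  -n 256
```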
## Prompt Format
Use the following prompt structure:
```
???
```
## Available Files
Below is a list of available quantized model files along with their quantization type, file size, and a short description.
| Filename | Quant Type | File Size | Description |
| ------------------------------------------------------------------------------------- | ---------- | --------- | --------------------------------------------------------------------------------------------- |
| [PLLuM-8x7B-nc-instruct-Q2_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q2_K | 17 GB | Very low quality but surprisingly usable. |
| [PLLuM-8x7B-nc-instruct-Q3_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q3_K | 21 GB | Low quality; suitable for setups with very limited RAM. |
| [PLLuM-8x7B-nc-instruct-Q3_K_L.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q3_K_L | 23 GB | Low quality but usable; the largest of the Q3 variants. |
| [PLLuM-8x7B-nc-instruct-Q3_K_M.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q3_K_M | 21 GB | Low quality; a middle ground among the Q3 variants. |
| [PLLuM-8x7B-nc-instruct-Q3_K_S.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q3_K_S | 20 GB | Lowest-quality Q3 variant; best space efficiency at this bit width. |
| [PLLuM-8x7B-nc-instruct-Q4_K_M.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q4_K_M | 27 GB | Good quality; a sensible default for most use cases – recommended. |
| [PLLuM-8x7B-nc-instruct-Q4_K_S.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q4_K_S | 25 GB | Slightly lower quality with greater space savings – recommended when size is a priority. |
| [PLLuM-8x7B-nc-instruct-Q5_K_M.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q5_K_M | 31 GB | High quality – recommended. |
| [PLLuM-8x7B-nc-instruct-Q5_K_S.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q5_K_S | 31 GB | High quality; an alternative with minimal quality loss. |
| [PLLuM-8x7B-nc-instruct-Q6_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q6_K | 36 GB | Very high quality, near-lossless – recommended. |
| [PLLuM-8x7B-nc-instruct-Q8_0.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q8_0 | 47 GB | Highest-quality quantization available. |
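Any of the files above can also be exposed over an OpenAI-compatible HTTP API using llama.cpp's bundled server. A hedged sketch; the model file, context length, and port are placeholder choices:
```bash
# Serve a quantized model with llama.cpp's HTTP server (release b4768).
./llama-server \
  -m ./PLLuM-8x7B-nc-instruct-Q4_K_M.gguf \
  -c 4096 \
  --port 8080

# In another terminal, query the OpenAI-compatible chat endpoint:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Przedstaw się w jednym zdaniu."}]}'
```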
## Downloading Using Hugging Face CLI
<details>
<summary>Click to view download instructions</summary>
First, ensure you have the Hugging Face CLI installed:
```bash
pip install -U "huggingface_hub[cli]"
```
Then, target a specific file to download:
```bash
huggingface-cli download Nondzu/PLLuM-8x7B-instruct-nc-GGUF --include "PLLuM-8x7B-instruct-nc-Q4_K_M.gguf" --local-dir ./
```
For larger files, you can download into a dedicated local directory (e.g., `PLLuM-8x7B-instruct-nc-Q8_0`) instead of the current directory (`./`), as in the sketch below.
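For instance, to pull the largest quant into its own folder (the directory name here is just an example):
```bash
# Download the Q8_0 file into a dedicated directory instead of ./
huggingface-cli download Nondzu/PLLuM-8x7B-instruct-nc-GGUF \
  --include "PLLuM-8x7B-instruct-nc-Q8_0.gguf" \
  --local-dir PLLuM-8x7B-instruct-nc-Q8_0
```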
</details>