|
--- |
|
license: cc-by-nc-4.0 |
|
language: |
|
- pl |
|
base_model: |
|
- CYFRAGOVPL/PLLuM-8x7B-nc-instruct |
|
--- |
|
# PLLuM-8x7B-nc-instruct GGUF Quantizations by Nondzu |
|
|
|
DISCLAIMER: I am not the author of the original model; this repository only hosts quantized versions of it. The quantized files are provided as-is, and I take no responsibility for their behavior or output.
|
|
|
This repository contains GGUF quantized versions of the [PLLuM-8x7B-nc-instruct](https://huggingface.co/CYFRAGOVPL/PLLuM-8x7B-nc-instruct) model. All quantizations were performed with [llama.cpp](https://github.com/ggerganov/llama.cpp) release [b4768](https://github.com/ggml-org/llama.cpp/releases/tag/b4768). The quantized models can be run in [LM Studio](https://lmstudio.ai/) or any other llama.cpp-based project.
|
|
|
## Prompt Format |
|
|
|
Use the following prompt structure: |
|
``` |
|
??? |
|
``` |
|
|
|
## Available Files |
|
|
|
Below is a list of available quantized model files along with their quantization type, file size, and a short description. |
|
|
|
| Filename | Quant Type | File Size | Description |
| ------------------------------------------------------------------------------------- | ---------- | --------- | --------------------------------------------------------------------------------------------- |
| [PLLuM-8x7B-nc-instruct-Q2_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q2_K | 17 GB | Very low quality but surprisingly usable; smallest option. |
| [PLLuM-8x7B-nc-instruct-Q3_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q3_K | 21 GB | Low quality, suitable for setups with very limited RAM. |
| [PLLuM-8x7B-nc-instruct-Q3_K_L.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q3_K_L | 23 GB | Slightly better quality than Q3_K_M at a modestly larger size. |
| [PLLuM-8x7B-nc-instruct-Q3_K_M.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q3_K_M | 21 GB | Low quality, usable when RAM is limited. |
| [PLLuM-8x7B-nc-instruct-Q3_K_S.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q3_K_S | 20 GB | Lower quality than Q3_K_M with improved space efficiency. |
| [PLLuM-8x7B-nc-instruct-Q4_K_M.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q4_K_M | 27 GB | Good quality; a sensible default for most use cases – recommended. |
| [PLLuM-8x7B-nc-instruct-Q4_K_S.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q4_K_S | 25 GB | Slightly lower quality than Q4_K_M with better space savings – recommended when size is a priority. |
| [PLLuM-8x7B-nc-instruct-Q5_K_M.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q5_K_M | 31 GB | High quality – recommended. |
| [PLLuM-8x7B-nc-instruct-Q5_K_S.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q5_K_S | 31 GB | High quality, an alternative with minimal quality loss. |
| [PLLuM-8x7B-nc-instruct-Q6_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q6_K | 36 GB | Very high quality, near-perfect output. |
| [PLLuM-8x7B-nc-instruct-Q8_0.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q8_0 | 47 GB | Maximum-quality quantization, closest to the original weights. |
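
The table above is essentially a size/quality trade-off, so a simple way to pick a file is to take the highest-quality quant whose file fits your memory budget. The sketch below illustrates that selection logic; the sizes are the approximate file sizes from this README, the quality ordering is the usual llama.cpp ranking, and the `best_quant` helper is hypothetical, not part of llama.cpp or this repository. Note that actual memory use at runtime exceeds the file size (KV cache and other overhead), so leave headroom.

```python
# Illustrative helper: pick the highest-quality quantization whose file
# fits a given memory budget. Sizes (GB) are the approximate file sizes
# from the table above; the list is ordered from lowest to highest quality.
QUANTS = [
    ("Q2_K", 17),
    ("Q3_K_S", 20),
    ("Q3_K_M", 21),
    ("Q3_K_L", 23),
    ("Q4_K_S", 25),
    ("Q4_K_M", 27),
    ("Q5_K_S", 31),
    ("Q5_K_M", 31),
    ("Q6_K", 36),
    ("Q8_0", 47),
]

def best_quant(budget_gb: float):
    """Return the highest-quality quant whose file size fits budget_gb, or None."""
    fitting = [name for name, size_gb in QUANTS if size_gb <= budget_gb]
    return fitting[-1] if fitting else None

print(best_quant(32))  # -> Q5_K_M (31 GB fits a 32 GB budget)
print(best_quant(16))  # -> None (even Q2_K at 17 GB does not fit)
```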
|
|
|
## Downloading Using Hugging Face CLI |
|
|
|
<details> |
|
<summary>Click to view download instructions</summary> |
|
|
|
First, ensure you have the Hugging Face CLI installed: |
|
|
|
```bash |
|
pip install -U "huggingface_hub[cli]" |
|
``` |
|
|
|
Then, target a specific file to download: |
|
|
|
```bash |
|
huggingface-cli download Nondzu/PLLuM-8x7B-instruct-nc-GGUF --include "PLLuM-8x7B-instruct-nc-Q4_K_M.gguf" --local-dir ./ |
|
``` |
|
|
|
For larger files, consider passing a dedicated `--local-dir` (e.g., `PLLuM-8x7B-instruct-nc-Q8_0`) instead of downloading into the current directory (`./`).
|
|
|
</details> |