---
license: cc-by-nc-4.0
language:
- pl
base_model:
- CYFRAGOVPL/PLLuM-8x7B-nc-instruct
---
# PLLuM-8x7B-nc-instruct GGUF Quantizations by Nondzu
DISCLAIMER: These are quantized builds of an existing model. I am not the author of the original model; I only host the quantized versions and take no responsibility for them.
This repository contains GGUF quantized versions of the [PLLuM-8x7B-nc-instruct](https://huggingface.co/CYFRAGOVPL/PLLuM-8x7B-nc-instruct) model. All quantizations were performed with [llama.cpp](https://github.com/ggerganov/llama.cpp) release [b4768](https://github.com/ggml-org/llama.cpp/releases/tag/b4768). These quantized models can be run in [LM Studio](https://lmstudio.ai/) or any other llama.cpp-based project.
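Once a file is downloaded, it can be smoke-tested directly from the command line. A minimal sketch, assuming a local llama.cpp build from the same release; the model path, prompt, and token count below are example values:
```bash
# Quick local test with llama.cpp's CLI (binary name as of release b4768).
# Adjust the model path to wherever you saved the GGUF file.
./llama-cli \
  -m ./PLLuM-8x7B-nc-instruct-Q4_K_M.gguf \
  -p "Napisz krótkie powitanie po polsku." \
  -n 256
```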
## Prompt Format
Use the following prompt structure:
```
???
```
## Available Files
Below is a list of available quantized model files along with their quantization type, file size, and a short description.
| Filename | Quant Type | File Size | Description |
| ------------------------------------------------------------------------------------- | ---------- | --------- | --------------------------------------------------------------------------------------------- |
| [PLLuM-8x7B-nc-instruct-Q2_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q2_K | 17 GB | Very low quality but surprisingly usable. |
| [PLLuM-8x7B-nc-instruct-Q3_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q3_K | 21 GB | Low quality; suitable for setups with very limited RAM. |
| [PLLuM-8x7B-nc-instruct-Q3_K_L.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q3_K_L | 23 GB | Low quality but usable; the largest of the Q3 variants. |
| [PLLuM-8x7B-nc-instruct-Q3_K_M.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q3_K_M | 21 GB | Low quality; a middle ground among the Q3 variants. |
| [PLLuM-8x7B-nc-instruct-Q3_K_S.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q3_K_S | 20 GB | Lowest-quality Q3 variant; best space efficiency at this bit width. |
| [PLLuM-8x7B-nc-instruct-Q4_K_M.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q4_K_M | 27 GB | Good quality; a sensible default for most use cases – recommended. |
| [PLLuM-8x7B-nc-instruct-Q4_K_S.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q4_K_S | 25 GB | Slightly lower quality with greater space savings – recommended when size is a priority. |
| [PLLuM-8x7B-nc-instruct-Q5_K_M.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q5_K_M | 31 GB | High quality – recommended. |
| [PLLuM-8x7B-nc-instruct-Q5_K_S.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q5_K_S | 31 GB | High quality; an alternative with minimal quality loss. |
| [PLLuM-8x7B-nc-instruct-Q6_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q6_K | 36 GB | Very high quality, near-lossless – recommended. |
| [PLLuM-8x7B-nc-instruct-Q8_0.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main) | Q8_0 | 47 GB | Highest-quality quantization available. |
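Any of the files above can also be exposed over an OpenAI-compatible HTTP API using llama.cpp's bundled server. A hedged sketch; the model file, context length, and port are placeholder choices:
```bash
# Serve a quantized model with llama.cpp's HTTP server (release b4768).
./llama-server \
  -m ./PLLuM-8x7B-nc-instruct-Q4_K_M.gguf \
  -c 4096 \
  --port 8080

# In another terminal, query the OpenAI-compatible chat endpoint:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Przedstaw się w jednym zdaniu."}]}'
```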
## Downloading Using Hugging Face CLI
<details>
<summary>Click to view download instructions</summary>
First, ensure you have the Hugging Face CLI installed:
```bash
pip install -U "huggingface_hub[cli]"
```
Then, target a specific file to download:
```bash
huggingface-cli download Nondzu/PLLuM-8x7B-instruct-nc-GGUF --include "PLLuM-8x7B-instruct-nc-Q4_K_M.gguf" --local-dir ./
```
For larger files, you can download into a dedicated local directory (e.g., `PLLuM-8x7B-instruct-nc-Q8_0`) instead of the current directory (`./`), as in the sketch below.
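For instance, to pull the largest quant into its own folder (the directory name here is just an example):
```bash
# Download the Q8_0 file into a dedicated directory instead of ./
huggingface-cli download Nondzu/PLLuM-8x7B-instruct-nc-GGUF \
  --include "PLLuM-8x7B-instruct-nc-Q8_0.gguf" \
  --local-dir PLLuM-8x7B-instruct-nc-Q8_0
```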
</details>