granite-3.1-8b-instruct-abliterated-exl2
Model: granite-3.1-8b-instruct-abliterated
Made by: huihui-ai
Granite 3 authors: Granite Team, IBM
Quants
4bpw h6 (main)
4.5bpw h6
5bpw h6
6bpw h6
8bpw h8
Quantization notes
Made with exllamav2 0.2.7 using the default calibration dataset. These quants require exllamav2 0.2.7 or newer.
Exl2 quants must be fully loaded into GPU VRAM; RAM offloading isn't natively supported.
They also require an Nvidia RTX GPU on Windows, or an Nvidia RTX or AMD ROCm GPU on Linux.
These quants can be used with TabbyAPI or Text-Generation-WebUI.
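Besides TabbyAPI and Text-Generation-WebUI, the quant can be loaded directly with the exllamav2 Python library. The sketch below is a minimal, hedged example: the local model directory path and the generation settings are illustrative assumptions, and it requires a supported GPU with the full quant in VRAM.

```python
# Minimal sketch: loading an exl2 quant with the exllamav2 Python API.
# The model directory path is an assumption; point it at a downloaded quant.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

model_dir = "./granite-3.1-8b-instruct-abliterated-exl2"  # assumed local path

config = ExLlamaV2Config(model_dir)
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # allocate the KV cache lazily
model.load_autosplit(cache)                # split layers across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)
print(generator.generate(prompt="Hello", max_new_tokens=32))
```

This runs entirely on the GPU, consistent with the VRAM requirement noted above.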
Original model card
huihui-ai/granite-3.1-8b-instruct-abliterated
This is an uncensored version of ibm-granite/granite-3.1-8b-instruct created with abliteration (see remove-refusals-with-transformers to learn more about it).
This is a crude, proof-of-concept implementation for removing refusals from an LLM without using TransformerLens.
Use with ollama
You can use huihui_ai/granite3.1-dense-abliterated directly:
ollama run huihui_ai/granite3.1-dense-abliterated
Model tree for cgus/granite-3.1-8b-instruct-abliterated-exl2
Base model
ibm-granite/granite-3.1-8b-base