README.md · SebastianSchramm/LlamaGuard-7b-GPTQ-4bit-128g-actorder

	---
	license: llama2
	language:
	- en
	library_name: transformers
	tags:
	- facebook
	- meta
	- pytorch
	- llama
	- llama-2
	- 4bit
	- gptq
	base_model: meta-llama/LlamaGuard-7b
	inference: false
	---

	# Quantized version of meta-llama/LlamaGuard-7b

	## Model Description

	The model [meta-llama/LlamaGuard-7b](https://huggingface.co/meta-llama/LlamaGuard-7b) was quantized to 4-bit precision with group size 128 and act-order enabled, using the AutoGPTQ integration in transformers (https://huggingface.co/blog/gptq-integration).
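
The settings above could be expressed roughly as follows with `GPTQConfig` from transformers, where act-order is exposed as `desc_act`. This is a minimal sketch, not the exact script used for this model: the calibration dataset (`"c4"`) and the helper names `gptq_settings`, `quantize`, and `load_quantized` are assumptions made for illustration, and running it requires `optimum` and `auto-gptq` to be installed alongside transformers.

```python
QUANTIZED_MODEL_ID = "SebastianSchramm/LlamaGuard-7b-GPTQ-4bit-128g-actorder"
BASE_MODEL_ID = "meta-llama/LlamaGuard-7b"


def gptq_settings() -> dict:
    """Quantization settings stated in the model description."""
    return {"bits": 4, "group_size": 128, "desc_act": True}


def quantize(base_model_id: str = BASE_MODEL_ID, calibration_dataset: str = "c4"):
    """Quantize the base model with GPTQ (hypothetical helper).

    The calibration dataset is an assumption; the model card does not
    say which data was used.
    """
    # Imported lazily so the settings can be inspected without the heavy deps.
    from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    config = GPTQConfig(
        dataset=calibration_dataset,
        tokenizer=tokenizer,
        **gptq_settings(),
    )
    return AutoModelForCausalLM.from_pretrained(
        base_model_id,
        quantization_config=config,
        device_map="auto",
    )


def load_quantized(model_id: str = QUANTIZED_MODEL_ID):
    """Load the already-quantized checkpoint and its tokenizer (hypothetical helper)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    return tokenizer, model
```

Calling `load_quantized()` is the likely path for most users, since it reuses the published weights instead of repeating the quantization.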

	## Evaluation

	To evaluate the quantized model and compare it with the full-precision model, I performed binary classification on the "toxicity" label of the lmsys/toxic-chat test set (~5k samples).

	📊 Full Precision Model:

	Average Precision Score: 0.3625

	📊 4-bit Quantized Model:

	Average Precision Score: 0.3450
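
For readers unfamiliar with the metric, average precision summarizes the precision-recall curve as the sum of precision values weighted by the increase in recall at each score threshold. The sketch below is a plain-Python illustration of that definition, not the evaluation code used for the numbers above. It assumes one binary toxicity label and one real-valued score per sample; how the scores were derived from the model output is not stated in this card, and the function name `average_precision` is chosen here for illustration.

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AP = sum over thresholds of (R_n - R_{n-1}) * P_n, scores descending."""
    total_positives = sum(y_true)
    if total_positives == 0:
        raise ValueError("average precision needs at least one positive label")

    # Rank samples from highest to lowest score.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])

    ap = 0.0
    true_positives = 0
    seen = 0
    previous_recall = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score so ties share one threshold.
        while i < len(ranked) and ranked[i][0] == threshold:
            true_positives += ranked[i][1]
            seen += 1
            i += 1
        recall = true_positives / total_positives
        precision = true_positives / seen
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


# Toy example: labels 1 = toxic, 0 = non-toxic; scores are hypothetical.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(round(average_precision(labels, scores), 4))  # 0.8333
```

With this definition, a score of 1.0 means every toxic sample is ranked above every non-toxic one, so the two values reported above can be compared directly as ranking quality on the same test set.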