How to quantize?
Dear author, I would like to ask how to quantize Qwen2.5-VL using AutoGPTQ?
Hi @xunmi1,
Qwen2.5-VL officially supports AWQ quantization. You can find more details at the following links.
[Qwen2.5-VL-3B-Instruct-AWQ] https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct-AWQ
[Qwen2.5-VL-7B-Instruct-AWQ] https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct-AWQ
[Qwen2.5-VL-72B-Instruct-AWQ] https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct-AWQ
Dear author,
I am trying to quantize the Qwen2.5-VL-7B-Instruct model using GPTQ, but I haven't been successful.
I encountered the following error:
Traceback (most recent call last):
  File "/app/quantilize_qwen_vl_gqtq_example.py", line 33, in <module>
    model = GPTQModel.load(model_id, quant_config, modules_not_to_quantize=["vision_encoder"], model_type="qwen")
  File "/usr/local/lib/python3.10/dist-packages/gptqmodel/models/auto.py", line 247, in load
    return cls.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/gptqmodel/models/auto.py", line 275, in from_pretrained
    model_type = check_and_get_model_type(model_id_or_path, trust_remote_code)
  File "/usr/local/lib/python3.10/dist-packages/gptqmodel/models/auto.py", line 184, in check_and_get_model_type
    raise TypeError(f"{config.model_type} isn't supported yet.")
TypeError: qwen2_5_vl isn't supported yet.
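From the traceback, the failure seems to happen before any quantization starts: `check_and_get_model_type` reads `model_type` from the checkpoint's `config.json` and raises if that architecture is not in the library's registry, and Qwen2.5-VL's config reports `qwen2_5_vl`. A minimal sketch of that check (the supported set here is an illustrative assumption, not GPTQModel's actual registry):

```python
import json

# Illustrative subset only -- the real registry lives in
# gptqmodel/models/auto.py (SUPPORTED_MODELS).
SUPPORTED_MODEL_TYPES = {"llama", "qwen2", "mistral"}

def check_and_get_model_type(config_json: str) -> str:
    """Mimic the library's check: read model_type from config.json
    and fail for architectures the registry does not know."""
    model_type = json.loads(config_json)["model_type"]
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise TypeError(f"{model_type} isn't supported yet.")
    return model_type

# Qwen2.5-VL's config.json reports "qwen2_5_vl", which is missing
# from the registry, reproducing the TypeError above.
try:
    check_and_get_model_type('{"model_type": "qwen2_5_vl"}')
except TypeError as e:
    print(e)  # qwen2_5_vl isn't supported yet.
```

So passing `model_type="qwen"` to `GPTQModel.load` cannot help: the check is driven by `config.json`, and the fix would have to come from the library adding `qwen2_5_vl` support.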
Since I found your repository on Hugging Face, I was wondering if you could kindly provide guidance on how to successfully quantize this model.
I would really appreciate your help.
Thank you in advance!
Best regards,
Sean
P.S. The following is my code:
from datasets import load_dataset
from gptqmodel import GPTQModel, QuantizeConfig
from transformers import AutoProcessor, AutoModelForVision2Seq
from PIL import Image
from gptqmodel.models.auto import SUPPORTED_MODELS
model_id = "huggingface/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/68156fd997cdc9f710620466735af49862bb81f6"
quant_path = "Qwen2.5-VL-gptqmodel-4bit"
# NOTE: this streaming dataset is immediately overwritten by the
# local imagefolder dataset loaded below.
calibration_dataset = load_dataset(
    "laion/laion2B-multi",
    split="train",
    streaming=True,
).take(1024)
calibration_dataset = load_dataset("imagefolder", data_dir="/app/calibra_dataset")
quant_config = QuantizeConfig(
bits=4,
group_size=128,
)
model = GPTQModel.load(
    model_id,
    quant_config,
    modules_not_to_quantize=["vision_encoder"],
    model_type="qwen",
)
model.quantize(calibration_dataset, batch_size=2)
model.save(quant_path)