How to quantize?

#1
by xunmi1 - opened

Dear author, could you tell me how to use AutoGPTQ to quantize Qwen2.5-VL?

Hi @xunmi1 , Qwen2.5-VL officially supports AWQ quantization. You can find more details at the following links.
[Qwen2.5-VL-3B-Instruct-AWQ](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct-AWQ)
[Qwen2.5-VL-7B-Instruct-AWQ](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct-AWQ)
[Qwen2.5-VL-72B-Instruct-AWQ](https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct-AWQ)

Dear author,

I am trying to quantize the Qwen2.5-VL-7B-Instruct model using GPTQ, but I haven't been successful.
I encountered the following error:

"Traceback (most recent call last): File "/app/quantilize_qwen_vl_gqtq_example.py", line 33, in model = GPTQModel.load(model_id, quant_config, modules_not_to_quantize=["vision_encoder"], model_type="qwen") File "/usr/local/lib/python3.10/dist-packages/gptqmodel/models/auto.py", line 247, in load return cls.from_pretrained( File "/usr/local/lib/python3.10/dist-packages/gptqmodel/models/auto.py", line 275, in from_pretrained model_type = check_and_get_model_type(model_id_or_path, trust_remote_code) File "/usr/local/lib/python3.10/dist-packages/gptqmodel/models/auto.py", line 184, in check_and_get_model_type raise TypeError(f"{config.model_type} isn't supported yet.") TypeError: qwen2_5_vl isn't supported yet."

Since I found your repository on Hugging Face, I was wondering if you could kindly provide guidance on how to successfully quantize this model.
I would really appreciate your help.

Thank you in advance!

Best regards,
Sean

P.S. My code is as follows:

from datasets import load_dataset
from gptqmodel import GPTQModel, QuantizeConfig

model_id = "huggingface/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/68156fd997cdc9f710620466735af49862bb81f6"
quant_path = "Qwen2.5-VL-gptqmodel-4bit"

# Calibration images from a local folder.
# Alternative I also tried: streaming 1024 samples from a public dataset:
# calibration_dataset = load_dataset("laion/laion2B-multi", split="train", streaming=True).take(1024)
calibration_dataset = load_dataset("imagefolder", data_dir="/app/calibra_dataset")

quant_config = QuantizeConfig(
    bits=4,
    group_size=128,
)

# This call raises: TypeError: qwen2_5_vl isn't supported yet.
model = GPTQModel.load(
    model_id,
    quant_config,
    modules_not_to_quantize=["vision_encoder"],
    model_type="qwen",
)
model.quantize(calibration_dataset, batch_size=2)
model.save(quant_path)
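As an aside, for readers wondering what `bits=4, group_size=128` in the `QuantizeConfig` controls: each group of 128 weights shares one scale (and offset), and every weight in the group is stored as a 4-bit integer (0–15). Below is a minimal min-max sketch of group-wise 4-bit quantization for illustration only; GPTQ itself uses Hessian-based error compensation rather than plain rounding:

```python
# Sketch of group-wise 4-bit quantization: one scale/offset per group,
# each weight rounded to a 4-bit code. Plain min-max rounding, NOT the
# actual GPTQ algorithm.

def quantize_group(weights, bits=4):
    qmax = (1 << bits) - 1                      # 15 for 4-bit
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax or 1.0             # guard against all-equal groups
    codes = [round((w - lo) / scale) for w in weights]
    dequant = [lo + c * scale for c in codes]   # reconstructed weights
    return codes, dequant


group = [i / 127 for i in range(128)]           # one group of 128 weights
codes, deq = quantize_group(group)
assert all(0 <= c <= 15 for c in codes)         # fits in 4 bits
max_err = max(abs(w - d) for w, d in zip(group, deq))
print(f"max reconstruction error: {max_err:.4f}")
```

The reconstruction error per weight is bounded by half the group's scale, which is why smaller `group_size` values trade extra metadata for higher fidelity.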