Qwen/Qwen2.5-VL-72B-Instruct-AWQ and a sub-40B Qwen2.5-VL-Instruct-AWQ, please

#1 opened by devops724

Hi,
can we have the 72B in AWQ for 48GB VRAM devices?
Also, a version under 40B, like 32B or 38B, in AWQ,
for loading on 24GB VRAM devices.
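
For reference, once an AWQ checkpoint is published it should load through transformers like any other repo. A minimal sketch, assuming the repo name `Qwen/Qwen2.5-VL-72B-Instruct-AWQ` (hypothetical until released) and that `autoawq` is installed:

```python
# Minimal sketch: loading an AWQ checkpoint of Qwen2.5-VL.
# Requires: pip install transformers accelerate autoawq
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor

model_id = "Qwen/Qwen2.5-VL-72B-Instruct-AWQ"  # assumed repo name

# AWQ weights are picked up from the checkpoint's quantization_config;
# device_map="auto" spreads layers across the available VRAM.
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)
```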

8-bit GPTQ would be nice... maybe FP8?
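
FP8 may not even need a separate checkpoint: vLLM can quantize weights to FP8 on the fly at load time on hardware that supports it. A rough sketch, assuming vLLM's Qwen2.5-VL support and the base (unquantized) repo:

```python
# Sketch: on-the-fly FP8 weight quantization with vLLM.
# Assumes a GPU with FP8 support (e.g. Ada/Hopper) and a recent vLLM.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-VL-72B-Instruct",  # base fp16/bf16 repo
    quantization="fp8",                    # dynamic FP8 weight quantization
    tensor_parallel_size=2,                # split across GPUs if one is too small
)
out = llm.generate(
    ["Summarize what AWQ quantization does in one sentence."],
    SamplingParams(max_tokens=64),
)
print(out[0].outputs[0].text)
```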

AWQ or GPTQ Int4/Int8 would be nice.
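
In the meantime, anyone with enough VRAM for the fp16 weights could try producing an AWQ quant themselves. A rough sketch with AutoAWQ, using a hypothetical sub-40B base repo; note that AutoAWQ's coverage of vision-language architectures varies by release, so this is the generic recipe, not a guaranteed path:

```python
# Sketch: producing a 4-bit AWQ quant with AutoAWQ.
# Requires: pip install autoawq transformers
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/Qwen2.5-VL-32B-Instruct"   # assumed base repo
quant_path = "Qwen2.5-VL-32B-Instruct-AWQ"    # local output dir

quant_config = {
    "zero_point": True,   # asymmetric quantization
    "q_group_size": 128,  # weights quantized in groups of 128
    "w_bit": 4,           # 4-bit weights, the usual AWQ setting
    "version": "GEMM",
}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Calibrates on a small default dataset and rewrites the linear layers.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```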
