Qwen/Qwen2.5-VL-72B-Instruct-AWQ and an under-40B Qwen2.5-VL-Instruct-AWQ, please
#1 by devops724 - opened
Hi,
Can we have the 72B model in AWQ, for 48 GB VRAM devices?
Also, an under-40B version (like 32B or 38B) in AWQ,
so it can load on 24 GB VRAM devices.
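As a rough sanity check on why these sizes line up with those VRAM targets, weight memory can be estimated as parameter count × bits per weight / 8 bytes (ignoring activations, KV cache, and quantization overhead, so the real footprint is somewhat higher). A minimal sketch:

```python
def quantized_weight_gib(params_billions: float, bits: int) -> float:
    """Approximate weight-only memory in GiB: params * (bits / 8) bytes."""
    return params_billions * 1e9 * bits / 8 / 2**30

# 72B at 4-bit (AWQ / GPTQ-Int4): ~33.5 GiB of weights -> fits 48 GB VRAM
print(round(quantized_weight_gib(72, 4), 1))
# 32B at 4-bit: ~14.9 GiB of weights -> fits 24 GB VRAM
print(round(quantized_weight_gib(32, 4), 1))
# 72B at 8-bit (Int8 / FP8): ~67.1 GiB of weights -> too large for 48 GB
print(round(quantized_weight_gib(72, 8), 1))
```

This is why 4-bit (AWQ or GPTQ-Int4) is the practical choice for 72B on a 48 GB card, while 8-bit 72B would need more than 64 GB.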
8-bit GPTQ would be nice... maybe FP8?
AWQ or GPTQ (Int4 or Int8) would be nice.
+1