Qwen/Qwen2.5-VL-72B-Instruct-AWQ and an under-40B Qwen2.5-VL-Instruct-AWQ, please
#1 by devops724 - opened
Hi,
Can we have the 72B model in AWQ, for 48 GB VRAM devices?
Also, an under-40B version (like 32B or 38B) in AWQ,
so it can load on 24 GB VRAM devices.
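As a rough sanity check on why these sizes line up with those VRAM targets, weight memory can be estimated as parameter count × bits per weight / 8 bytes (ignoring activations, KV cache, and quantization overhead, so the real footprint is somewhat higher). A minimal sketch:

```python
def quantized_weight_gib(params_billions: float, bits: int) -> float:
    """Approximate weight-only memory in GiB: params * (bits / 8) bytes."""
    return params_billions * 1e9 * bits / 8 / 2**30

# 72B at 4-bit (AWQ / GPTQ-Int4): ~33.5 GiB of weights -> fits 48 GB VRAM
print(round(quantized_weight_gib(72, 4), 1))
# 32B at 4-bit: ~14.9 GiB of weights -> fits 24 GB VRAM
print(round(quantized_weight_gib(32, 4), 1))
# 72B at 8-bit (Int8 / FP8): ~67.1 GiB of weights -> too large for 48 GB
print(round(quantized_weight_gib(72, 8), 1))
```

This is why 4-bit (AWQ or GPTQ-Int4) is the practical choice for 72B on a 48 GB card, while 8-bit 72B would need more than 64 GB.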
8-bit GPTQ would be nice... maybe FP8?
AWQ or GPTQ (Int4 or Int8) would be nice.
+1