int4/8 quantization so that we can deploy on consumer-grade GPU cards
#7
by
Yhyu13
- opened
Hi, would you be willing to release a quantized version of GLM-10B? That would allow it to run on a 16 GB card, which would be great.
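For context on why int8 would help here: a 10B-parameter model stored in fp32 takes roughly 40 GB of weights, while int8 cuts that to roughly 10 GB, which is what brings a 16 GB card into range. Below is a toy per-tensor absmax int8 quantization sketch to illustrate the idea; it is not GLM's actual quantization code, and the function names are made up for illustration.

```python
import numpy as np

def quantize_int8(w):
    # Per-tensor absmax scaling: map the largest |weight| to 127.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate fp32 weights for compute.
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)

# int8 storage is 1/4 the bytes of fp32, and the rounding error
# per weight is bounded by half the scale.
assert q.nbytes == w.nbytes // 4
assert np.max(np.abs(w_hat - w)) <= s
```

Real int8/int4 inference schemes (e.g. per-channel scales, outlier handling) are more involved than this, but the storage arithmetic above is the core of the memory-savings argument.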