int4/8 quantization so that we can deploy on consumer-grade GPU cards
#7
by
Yhyu13
- opened
Hi, would you be willing to release a quantized version of GLM-10B? That would allow it to run on a 16 GB card, which would be great.
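For context on why int8 would help here: a 10B-parameter model stored in fp32 takes roughly 40 GB of weights, while int8 cuts that to roughly 10 GB, which is what brings a 16 GB card into range. Below is a toy per-tensor absmax int8 quantization sketch to illustrate the idea; it is not GLM's actual quantization code, and the function names are made up for illustration.

```python
import numpy as np

def quantize_int8(w):
    # Per-tensor absmax scaling: map the largest |weight| to 127.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate fp32 weights for compute.
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)

# int8 storage is 1/4 the bytes of fp32, and the rounding error
# per weight is bounded by half the scale.
assert q.nbytes == w.nbytes // 4
assert np.max(np.abs(w_hat - w)) <= s
```

Real int8/int4 inference schemes (e.g. per-channel scales, outlier handling) are more involved than this, but the storage arithmetic above is the core of the memory-savings argument.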