TheDrummer's GLM Steam quantized using ik_llama.cpp
My first attempt at quantizing something "on my own".
I tried using both bartowski's and mradermacher's imatrix files, but neither worked for me, so I had to make one myself following the guide (skill issue).
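For the curious, the imatrix step boils down to something like the sketch below. This is not my exact invocation; the model and calibration file names are placeholders:

```bash
# generate an importance matrix from the full-precision GGUF
# using a plain-text calibration file (filenames are placeholders)
./build/bin/llama-imatrix \
    -m GLM-Steam-BF16.gguf \
    -f calibration_data.txt \
    -o imatrix.dat
```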
This quant requires the ik_llama.cpp fork to work properly.
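If you've never built the fork, it's the standard cmake flow (check the repo's README for flags matching your hardware, e.g. CUDA):

```bash
# clone and build ik_llama.cpp with default options
git clone https://github.com/ikawrakow/ik_llama.cpp
cd ik_llama.cpp
cmake -B build
cmake --build build --config Release -j
```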
I followed ubergarm's quant cookers basic guide, but since I had no idea what I was doing I just copied his recipes and applied them to TheDrummer's model.
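The core of a recipe is a llama-quantize call that feeds in the imatrix and overrides quant types per tensor. Below is only a sketch assuming ik_llama.cpp's `--custom-q` option, with placeholder tensor regexes and types rather than ubergarm's actual recipe:

```bash
# sketch of the quantize step; the --custom-q pairs and the IQ2_KS target
# below are illustrative placeholders, not the actual recipe
./build/bin/llama-quantize \
    --imatrix imatrix.dat \
    --custom-q "token_embd\.weight=q8_0,output\.weight=q8_0" \
    GLM-Steam-BF16.gguf GLM-Steam-IQ2_KS.gguf IQ2_KS
```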
I also used general calibration data instead of an RP-focused set, so performance may suffer a bit.
Feel free to roast me if I messed something up (which I certainly did).