lcpp_mod.patch
Hi. I too like to make image model quants to use with ComfyUI-GGUF.
It looks like you have been working on quantizing clip_g, any success so far?
I tried your patch, but at first it wouldn't apply to llama.cpp-b3600 (at one point it has LLM_ARCH_FLUX etc. as context lines instead of as additions). After fixing that, the resulting llama-quantize produces this error when trying to quantize a clip_g:
main: quantizing 'models/gguf/clip_g.gguf' to 'models/gguf/clip_g-Q4_0.gguf' as Q4_0
ggml/src/ggml.c:3728: GGML_ASSERT(n_dims >= 1 && n_dims <= GGML_MAX_DIMS) failed
clip_g.gguf had been created successfully with your convert_g.py, and it looks like a valid file when checked with koboldcpp --analyze:
Analyzing models/gguf/clip_g.gguf, please wait...
---
*** GGUF FILE METADATA ***
GGUF.version = 3
GGUF.tensor_count = 518
GGUF.kv_count = 2
str: general.architecture = clip_g
u32: general.file_type = 1
*** GGUF TENSOR INFO ***
0 : f32 | logit_scale | []
...
517: f16 | text_projection.weight | [1280, 1280]
Metadata and TensorInfo Bytes: 42705
---
But logit_scale has no dimensions (shape []) — could that 0-dim tensor be what trips the n_dims >= 1 assert?
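If that scalar really is the culprit, one possible workaround in the conversion script would be to promote 0-dim tensors to shape (1,) before writing them, so n_dims becomes 1 while the value is unchanged. A minimal numpy sketch — the helper name is mine, not something from convert_g.py:

```python
import numpy as np

def promote_scalar(name: str, data: np.ndarray) -> np.ndarray:
    # ggml asserts 1 <= n_dims <= GGML_MAX_DIMS, so a 0-dim tensor
    # like logit_scale (shape ()) would fail that check.
    # Reshaping it to (1,) keeps the value but gives it one dimension.
    if data.ndim == 0:
        return data.reshape(1)
    return data

scalar = np.array(4.6052, dtype=np.float32)  # shape (), like logit_scale
fixed = promote_scalar("logit_scale", scalar)
assert fixed.ndim == 1 and fixed.shape == (1,)
```

This is only a guess at the failure mode; whether ComfyUI-GGUF then reads the reshaped tensor back correctly would still need checking.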
Didn't mean to close the discussion... I'm new here :)
Anyway, I tried using clip_g.gguf and it looks like the ComfyUI GGUF clip loader custom node doesn't even support the clip_g architecture yet.
To quantize clip_g, I need to modify lcpp.patch and either add a custom node or modify the existing GGUF custom nodes. Unfortunately, I am testing T5 with SDXL at the moment and will need to get around to this at a later point.