https://huggingface.co/GAIR/LIMO
I found another interesting model (thank you for quantizing the prior one :).
As the name suggests, LIMO follows the "less is more" principle; more specifically:
LIMO challenges the conventional wisdom in mathematical reasoning by demonstrating that models can achieve superior performance with significantly less but higher quality training data.
I think quantizing LIMO to GGUFs is reasonable, for people's convenience (e.g. to verify/use this model).
LIMO probably doesn't need software modifications to quantize and run, since the authors wrote the following about it:
Our model is fine-tuned on Qwen2.5-32B-Instruct and is compatible with most mainstream frameworks like HF Transformers, VLLM, TensorRT-LLM and etc.
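Since it is a plain Qwen2.5-32B-Instruct fine-tune, the standard llama.cpp GGUF workflow should apply unchanged. A rough sketch (the conversion script and binaries are the ones shipped with current llama.cpp; the output filenames and Q4_K_M choice are just my example):

```shell
# Fetch the HF repo (a 32B model, so the weights are large)
git clone https://huggingface.co/GAIR/LIMO

# Convert the HF checkpoint to a single f16 GGUF file
# (convert_hf_to_gguf.py ships in the llama.cpp repo)
python llama.cpp/convert_hf_to_gguf.py LIMO \
  --outfile LIMO-F16.gguf --outtype f16

# Quantize down to a smaller format, e.g. Q4_K_M
llama.cpp/build/bin/llama-quantize LIMO-F16.gguf LIMO-Q4_K_M.gguf Q4_K_M

# Quick smoke test of the quantized model
llama.cpp/build/bin/llama-cli -m LIMO-Q4_K_M.gguf -p "Hello"
```

Nothing model-specific here; that is exactly why no software changes should be needed.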
Thanks for your short summary, that makes it much more interesting (haha, what a cheesy model name, too :). It's queued, as usual, and you can check on its status at http://hf.tst.eu/status.html
Don't feel compelled to make a case for each model request, though, at least right now; we'll just quantize more or less blindly when something is requested. Still, your explanation is appreciated :)