GGUF quant(s) for this model: https://huggingface.co/mergekit-community/mergekit-dare_ties-ocypetp
The first quant published uses another iMatrix (Bartowski's iMatrix for the original Nevoria 70b). But the model is already good for RP. Very good, actually.
Edit: I made an iMatrix for this model and re-quantized (-> v2).
Sadly (or hopefully, since it means the iMatrix of one model can be quite useful for quantizing another with a good approximation), a dedicated iMatrix doesn't fix the high perplexity (+0.35-0.5 compared to most L3.1/3.3 merges). I guess I didn't pick the most suitable merge, perplexity-wise. The Q5_0 quant is a direct conversion from the bf16 weights, with Q8_0 for the embeddings, output weight, and attn_v, and the rest in Q5_0; I used that quant to make the iMatrix.
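For reference, here is a minimal sketch of that workflow driving llama.cpp's CLI tools from Python. The file names, calibration text, and exact flag set are assumptions (check `llama-quantize --help` on your build); in particular, the attn_v override may need a newer llama.cpp build with per-tensor overrides.

```python
# Hypothetical sketch of the quant + iMatrix workflow described above.
# Assumes llama.cpp's llama-quantize / llama-imatrix binaries are on PATH;
# all file names and the calibration corpus are placeholders.
import subprocess

BF16 = "merge-bf16.gguf"          # bf16 weights converted to GGUF (placeholder name)
Q5_BOOT = "merge-q5_0-boot.gguf"  # first-pass quant used to compute the iMatrix
IMATRIX = "imatrix.dat"
CALIB = "calibration.txt"         # any broad text corpus as calibration data

def run(cmd):
    print(" ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. First-pass quant: Q8_0 embeddings and output tensor, Q5_0 elsewhere.
#    (The attn_v-in-Q8_0 part mentioned above likely needs a build with
#    per-tensor overrides, e.g. a `--tensor-type attn_v=q8_0`-style flag.)
run(["llama-quantize",
     "--token-embedding-type", "q8_0",
     "--output-tensor-type", "q8_0",
     BF16, Q5_BOOT, "Q5_0"])

# 2. Compute the importance matrix from that first-pass quant.
run(["llama-imatrix", "-m", Q5_BOOT, "-f", CALIB, "-o", IMATRIX])

# 3. Re-quantize the bf16 weights with the dedicated iMatrix (the "v2" quants).
run(["llama-quantize", "--imatrix", IMATRIX,
     BF16, "merge-q5_0-v2.gguf", "Q5_0"])
```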
Edit again: After testing, the high perplexity of this merge (4.5 on wikitext at 512 ctx instead of 3.9-4.1) is, at least partly, due to Eva 0.1 itself (which has +0.3 ppl on wikitext 512 ctx compared to this merge). Considering its prose, that means this merge retained the qualities of Eva 0.1 with a lower ppl, plus those of Nemotron 3.1 instruct and of the Llama 3.3 instruct base, both in their non-abliterated versions.
All things considered, that's satisfactory.
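If you want to reproduce the perplexity comparison, a rough sketch with llama.cpp's perplexity tool follows; the GGUF and dataset file names are assumptions, and `-c 512` matches the 512-token context used for the figures quoted above.

```python
# Hypothetical sketch: compare wikitext perplexity of two quants at 512 ctx.
# Assumes llama-perplexity is on PATH and wikitext-2-raw has been downloaded.
import subprocess

for gguf in ["merge-q5_0-v2.gguf", "eva-0.1-q5_0.gguf"]:  # placeholder file names
    subprocess.run(["llama-perplexity", "-m", gguf,
                    "-f", "wikitext-2-raw/wiki.test.raw", "-c", "512"],
                   check=True)
```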