Pure quantizations of Mathstral-7B-v0.1 for mistral.java.

In the wild, Q8_0 quantizations are usually fine, but Q4_0 quantizations are rarely pure; for example, the output.weight tensor is often quantized with Q6_K instead of Q4_0.
A pure Q4_0 quantization can be generated from a high-precision (F32, F16, BFLOAT16) .gguf source with the llama-quantize utility from llama.cpp as follows:

```
./llama-quantize --pure ./Mathstral-7B-v0.1-F32.gguf ./Mathstral-7B-v0.1-Q4_0.gguf Q4_0
```
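
To verify that the result is in fact pure, the per-tensor quantization types can be inspected. The sketch below uses the gguf Python package that ships with llama.cpp (gguf-py); the GGUFReader usage and the check_purity helper are assumptions for illustration, not part of this repository:

```python
# Sketch: check that every quantized tensor in a .gguf file uses the expected type.
# Assumes the `gguf` package from llama.cpp's gguf-py is installed (pip install gguf).
from gguf import GGUFReader

def check_purity(path: str, expected: str = "Q4_0") -> None:
    reader = GGUFReader(path)
    for t in reader.tensors:
        qtype = t.tensor_type.name
        # 1-D tensors (norms, biases) typically remain F32 even with --pure.
        if qtype not in (expected, "F32"):
            print(f"{t.name}: {qtype} (expected {expected})")

check_purity("./Mathstral-7B-v0.1-Q4_0.gguf")
```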

Original model: https://huggingface.co/mistralai/mathstral-7B-v0.1

**Note that this model does not support a system prompt.**

Mathstral 7B is a model specializing in mathematical and scientific tasks, based on Mistral 7B. You can read more in the official blog post.
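
Because there is no system-prompt slot, prompts are formatted with user/assistant turns only. The helper below is a hypothetical illustration based on the standard Mistral instruct template; check the tokenizer's chat template before relying on the exact markers:

```python
# Sketch: format a single-turn Mathstral prompt without a system message.
# The [INST] ... [/INST] markers follow the standard Mistral instruct template (assumed);
# the BOS token <s> is normally prepended by the tokenizer, not written as literal text.
def build_prompt(user_message: str) -> str:
    return f"[INST] {user_message.strip()} [/INST]"

print(build_prompt("Prove that the sum of the first n odd numbers is n^2."))
```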

Format: GGUF
Model size: 7.25B params
Architecture: llama
Quantizations: 4-bit (Q4_0), 8-bit (Q8_0)