These are GGUF quants of Sappha-2b-v3. The original model card is below:

sappha-2b-v3

a slightly less experimental qlora instruct finetune of the gemma-2b base model. trained with unsloth.

benchmarks

| benchmark | gemma-2b-it | sappha-2b-v3 | dolphin-2.8-gemma-2b |
| --- | --- | --- | --- |
| MMLU (five-shot) | 36.98 | 38.02 | 37.89 |
| HellaSwag (zero-shot) | 49.22 | 51.70 | 47.79 |
| PIQA (one-shot) | 75.08 | 75.46 | 71.16 |
| TruthfulQA (zero-shot) | 37.51 | 31.65 | 37.15 |

prompt format

basic chatml:

<|im_start|>system
You are a useful and helpful AI assistant.<|im_end|>
<|im_start|>user
what are LLMs?<|im_end|>
<|im_start|>assistant
LLMs, or Large Language Models, are advanced artificial intelligence systems that can perform tasks similar to human language. They are trained on vast amounts of data and can understand and respond to human queries. LLMs are often used in various applications, such as language translation, text generation, and question answering.<|im_end|>
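
if you're assembling the prompt by hand, the layout above can be sketched roughly like this. this is a minimal illustration, not something from the original card: `build_chatml_prompt` and its `add_generation_prompt` parameter are hypothetical names, and it assumes each turn ends with `<|im_end|>` followed by a newline, as in the example above.

```python
from typing import Dict, List

# Special tokens taken from the prompt format shown in this card.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(
    messages: List[Dict[str, str]],
    add_generation_prompt: bool = True,
) -> str:
    """Hypothetical helper: render role/content messages as a ChatML string."""
    parts = []
    for message in messages:
        # Each turn: <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a useful and helpful AI assistant."},
    {"role": "user", "content": "what are LLMs?"},
]
prompt = build_chatml_prompt(messages)
print(prompt)
```

the resulting string ends with an open assistant turn, so when generating you would probably want to stop on `<|im_end|>`.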

quants

gguf: https://huggingface.co/Fizzarolli/sappha-2b-v3-GGUF

what happened to v2?

it was a private failure :)

GGUF

model size: 2.51B params
architecture: gemma
available quants: 3-bit, 4-bit, 6-bit, 8-bit, 16-bit
