Pinkstack/Superthoughts-lite-v1-GGUF

@@ -1,22 +1,84 @@
 ---
-base_model: Pinkstack/Superthoughts-lite-1.8B-sft
 tags:
 - text-generation-inference
 - transformers
 - unsloth
 - llama
 - gguf
 license: apache-2.0
 language:
 - en
 ---
 # Uploaded  model
 - **Developed by:** Pinkstack
 - **License:** apache-2.0
-- **Finetuned from model :** Pinkstack/Superthoughts-lite-1.8B-sft
-This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 ---
+base_model:
+- Pinkstack/Superthoughts-lite-v1
 tags:
 - text-generation-inference
 - transformers
 - unsloth
 - llama
 - gguf
+- cot
+- superthoughts
+- reasoning
+- grpo
 license: apache-2.0
 language:
 - en
+datasets:
+- openai/gsm8k
+- Pinkstack/intructions-sft-sharegpt
 ---
+![superthoughts lite](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/K5kYIHYj2aX2kB6MlcM9O.png)
+# Information
+Advanced, high-quality, lightweight reasoning in a tiny model that you can run on your phone.
+Trained similarly to DeepSeek R1: we used SmolLM2 as the base model, SFT fine-tuned it on reasoning and slightly modified the tokenizer, then used GRPO to further improve its mathematics and problem-solving abilities.
+# Which quant is right for you?
+- ***F16***: Fewest hallucinations and high-quality reasoning, but heavy to run.
+- ***Q8_0***: Few hallucinations and high-quality reasoning. Recommended.
+- ***Q6_K***: Hallucinates more; reasoning is good, but it may fail at tasks such as counting. Use only if you cannot run Q8_0.
+- ***Q4_K_M***: Not recommended. Hallucinates and doesn't always think properly, though it is easier to run.
+# Format
+```
+<|im_start|>user
+How many R's in strawberry<|im_end|>
+<|im_start|>assistant
+<think>
+Alright, the user has asked how many R's in the word strawberry, that's easy! I just need to count each instance of the letter 'R' in the word 's-t-r-a-w-b-e-r-r-y' and then find out how many R's there are, lets count!
+S - Not an R,
+T - Not an R,
+R - First instance of the letter R! (1),
+A - Not an R,
+W - Not an R,
+B - Not an R,
+E - Not an R,
+R - Great! Second instance of the letter R. (2),
+R - Third instance of the letter R. (3),
+Y - Not an R.
+So, i've counted all the letters correctly, meaning that I am sure that there are 3 R's in the word Strawberry. I should probably let the user know.
+</think>
+<output>3
+</output><|im_end|>
+```
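As a minimal sketch, the prompt layout shown above could be assembled in Python as follows. The helper name `build_prompt` is hypothetical, and the `system` turn follows the usual ChatML convention, which is an assumption here since the example above only shows the user and assistant turns.

```python
from typing import Optional


def build_prompt(user_message: str, system_prompt: Optional[str] = None) -> str:
    """Assemble a ChatML-style prompt that stops at the start of the assistant turn."""
    parts = []
    if system_prompt:
        # Assumption: the system turn uses the same <|im_start|>/<|im_end|> wrapping.
        parts.append(f"<|im_start|>system\n{system_prompt}<|im_end|>")
    parts.append(f"<|im_start|>user\n{user_message}<|im_end|>")
    # Leave the assistant turn open so the model continues with <think>...
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("How many R's in strawberry"))
```

Most runtimes that load GGUF files apply the chat template for you, so a helper like this is mainly useful when sending raw completion requests.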
+# System prompt
+(Important: this ensures the model always thinks first and then gives its output.)
+```
+respond in the following format:
+<think>
+...
+</think>
+<output>
+...
+</output>
+```
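Because the system prompt above asks for `<think>` and `<output>` sections, a client may want to separate the two before showing the answer. The following is a rough sketch of one way to do that; the function name `split_response` is hypothetical, and it assumes the tags appear at most once and are not nested.

```python
import re
from typing import Optional, Tuple

# Non-greedy matches, with DOTALL so the sections can span multiple lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
OUTPUT_RE = re.compile(r"<output>(.*?)</output>", re.DOTALL)


def split_response(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (reasoning, answer); either is None if its tags are missing."""
    think = THINK_RE.search(text)
    output = OUTPUT_RE.search(text)
    return (
        think.group(1).strip() if think else None,
        output.group(1).strip() if output else None,
    )


if __name__ == "__main__":
    reply = "<think>\nCount the letters.\n</think>\n<output>3\n</output><|im_end|>"
    reasoning, answer = split_response(reply)
    print(answer)
```

Returning `None` for a missing section lets the caller detect replies where the model skipped the format, which the quant notes suggest can happen with smaller quants.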
+# Examples:
+All responses below were generated with our system prompt and a temperature of 0.7, inside the Android application ChatterUI via the GGUF Q8_0 quant, using the model's prompt format.
+1)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/5veZJmkjuv_7W7pKhvsu0.png)
+2)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/pAwPdVkEZ7rnFf-TZ5tMU.png)
+3)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/FDaWAAqgv2kvoZvjl8gjl.png)
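The settings used for the examples above (the system prompt and a temperature of 0.7) could be captured in a request payload along these lines. This is only a sketch: the `messages`/`temperature` layout assumes an OpenAI-style chat API, which many GGUF runtimes expose but which this model card does not specify, and `build_chat_request` is a hypothetical helper.

```python
SYSTEM_PROMPT = (
    "respond in the following format:\n"
    "<think>\n...\n</think>\n"
    "<output>\n...\n</output>"
)


def build_chat_request(user_message: str, temperature: float = 0.7) -> dict:
    """Build a chat payload with the system prompt and sampling temperature."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        # 0.7 matches the temperature reported for the example screenshots.
        "temperature": temperature,
    }


if __name__ == "__main__":
    print(build_chat_request("How many R's in strawberry"))
```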
 # Uploaded  model
 - **Developed by:** Pinkstack
 - **License:** apache-2.0
+- **Finetuned from model :** HuggingFaceTB/SmolLM2-1.7B-Instruct