brittlewis12
/

s1.1-32B-GGUF

Text Generation

Inference Endpoints

Model card Files Files and versions Community

s1.1-32B-GGUF / README.md

brittlewis12's picture

Create README.md

c4e4a82 verified about 1 month ago

|

history blame contribute delete

3.29 kB

	---
	base_model: simplescaling/s1.1-32B
	pipeline_tag: text-generation
	inference: true
	language:
	- en
	license: apache-2.0
	model_creator: simplescaling
	model_name: s1.1-32B
	model_type: qwen2
	datasets:
	- simplescaling/s1K
	quantized_by: brittlewis12

	---

	# s1.1 32B GGUF

	Original model: [s1.1 32B](https://huggingface.co/simplescaling/s1.1-32B)

	Model creator: [simplescaling](https://huggingface.co/simplescaling)

	> s1.1 is our sucessor of s1 with better reasoning performance by leveraging reasoning traces from r1 instead of Gemini.

	> s1 is a reasoning model finetuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches o1-preview & exhibits test-time scaling via budget forcing.

	This repo contains GGUF format model files for simplescaling’s s1.1 32B, an open reproduction of OpenAI’s o1-preview including model, source code, and data (see [s1K](https://huggingface.co/datasets/simplescaling/s1K)).

	Learn more on simplescaling’s [s1 github repo](https://github.com/simplescaling/s1), [arxiv preprint](https://arxiv.org/abs/2501.19393), and on [twitter](https://twitter.com/Muennighoff/status/1889310803746246694).

	### What is GGUF?

	GGUF is a file format for representing AI models. It is the third version of the format,
	introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.
	Converted with llama.cpp build 4689 (revision [90e4dba](https://github.com/ggerganov/llama.cpp/commits/90e4dba461b07e635fd1daf3b491c978c7dd0013)),
	using [autogguf-rs](https://github.com/brittlewis12/autogguf-rs).

	### Prompt template: ChatML

	```
	<\|im_start\|>system
	{{system_message}}<\|im_end\|>
	<\|im_start\|>user
	{{prompt}}<\|im_end\|>
	<\|im_start\|>assistant

	```

	---

	## Download & run with [cnvrs](https://twitter.com/cnvrsai) on iPhone, iPad, and Mac!

	![cnvrs.ai](https://pbs.twimg.com/profile_images/1744049151241797632/0mIP-P9e_400x400.jpg)

	[cnvrs](https://testflight.apple.com/join/sFWReS7K) is the best app for private, local AI on your device:
	- create & save Characters with custom system prompts & temperature settings
	- download and experiment with any GGUF model you can [find on HuggingFace](https://huggingface.co/models?library=gguf)!
	* or, use an API key with the chat completions-compatible model provider of your choice -- ChatGPT, Claude, Gemini, DeepSeek, & more!
	- make it your own with custom Theme colors
	- powered by Metal ⚡️ & [Llama.cpp](https://github.com/ggerganov/llama.cpp), with haptics during response streaming!
	- try it out yourself today, on [Testflight](https://testflight.apple.com/join/sFWReS7K)!
	- follow [cnvrs on twitter](https://twitter.com/cnvrsai) to stay up to date

	---

	## Original Model Evaluation

	> Note that s1-32B and s1.1-32B use budget forcing in this table; specifically ignoring end-of-thinking and appending "Wait" once or twice.

	\| Metric \| s1-32B \| s1.1-32B \| o1-preview \| o1 \| DeepSeek-R1 \| DeepSeek-R1-Distill-Qwen-32B \|
	\|---\|---\|---\|---\|---\|---\|---\|
	\| # examples \| 1K \| 1K \| ? \| ? \| >800K \| 800K \|
	\| AIME2024 \| 56.7 \| 56.7 \| 40.0 \| 74.4 \| 79.8 \| 72.6 \|
	\| AIME2025 I \| 26.7 \| 60.0 \| 37.5 \| ? \| 65.0 \| 46.1 \|
	\| MATH500 \| 93.0 \| 95.4 \| 81.4 \| 94.8 \| 97.3 \| 94.3 \|
	\| GPQA-Diamond \| 59.6 \| 63.6 \| 75.2 \| 77.3 \| 71.5 \| 62.1 \|