s1-0.5B / README.md
2stacks's picture
Update README.md
e19143a verified
metadata
pipeline_tag: text-generation
inference: true
license: apache-2.0
datasets:
  - simplescaling/s1K
base_model:
  - Qwen/Qwen2.5-0.5B-Instruct
library_name: transformers

Model Summary

s1-0.5B is a reasoning model finetuned from Qwen2.5-0.5B-Instruct on just 1,000 examples. This model was created simply to test the process used to train the original S1 cited below using consumer grade GPUs.

Use

The model usage is documented here.

Citation

@misc{muennighoff2025s1simpletesttimescaling,
      title={s1: Simple test-time scaling}, 
      author={Niklas Muennighoff and Zitong Yang and Weijia Shi and Xiang Lisa Li and Li Fei-Fei and Hannaneh Hajishirzi and Luke Zettlemoyer and Percy Liang and Emmanuel Candès and Tatsunori Hashimoto},
      year={2025},
      eprint={2501.19393},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.19393}, 
}