OpenSeek
We adopt the OctoThinker recipe to build a strong reasoning foundation. Training proceeds in two phases: a stable mid-training phase on 200 billion tokens of a mathematical corpus, followed by a 20-billion-token decay phase. We then fine-tune the model on the Infinity-Instruct dataset to improve instruction following. The model is open-sourced as a baseline for future experiments, such as enhancing the reasoning capabilities of small models through reinforcement learning. The architecture is identical to that of OpenSeek-Small-v1.
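The stable-then-decay split follows the warmup-stable-decay (WSD) family of schedules. As a rough illustration only: the 200B/20B token split comes from this card, but the decay shape and the `peak_lr`/`min_lr` values below are assumptions, not documented settings.

```python
def mid_training_lr(tokens_seen: float,
                    stable_tokens: float = 200e9,   # stable phase size (from the card)
                    decay_tokens: float = 20e9,     # decay phase size (from the card)
                    peak_lr: float = 3e-4,          # assumed, not from the card
                    min_lr: float = 3e-5) -> float: # assumed, not from the card
    """Constant LR through the stable phase, then anneal over the decay phase."""
    if tokens_seen <= stable_tokens:
        return peak_lr
    # Linear anneal from peak_lr to min_lr across the final 20B tokens
    # (the linear shape is an assumption; cosine decay is also common).
    frac = min(1.0, (tokens_seen - stable_tokens) / decay_tokens)
    return peak_lr + frac * (min_lr - peak_lr)
```

Evaluation results on four math benchmarks: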
| Metric | GSM8K | MATH-500 | Minerva Math | OlympiadBench | Avg. |
|---|---|---|---|---|---|
| Pass@1 | 20.698 | 13.100 | 3.470 | 2.741 | 10.002 |
| Pass@4 | 41.768 | 19.100 | 8.415 | 4.997 | 18.570 |
| Pass@8 | 51.838 | 19.599 | 11.680 | 5.185 | 22.075 |
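Pass@k is typically computed with the unbiased estimator of Chen et al. (2021); this card does not state its exact evaluation setup, so treat the following as a sketch of the standard metric, with `n` total samples per problem and `c` correct ones:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k draws
    (without replacement) from n samples, c of them correct, succeeds.
    pass@k = 1 - C(n - c, k) / C(n, k)."""
    if n - c < k:  # fewer than k incorrect samples: success is guaranteed
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# Example: 8 samples per problem, 3 of them correct
print(pass_at_k(n=8, c=3, k=4))  # ≈ 0.929
```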
Base model: BAAI/OpenSeek-Small-v1
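A minimal inference sketch with Transformers. The repo id below is the base model named on this card; the fine-tuned checkpoint may live under a different id, and a custom architecture may additionally require `trust_remote_code=True`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Base-model id from this card; substitute the fine-tuned checkpoint's id.
model_id = "BAAI/OpenSeek-Small-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Question: What is 17 * 24? Answer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```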