# **PCL-Reasoner-V1**
## Model Overview
We release PCL-Reasoner-V1, a model built on Qwen2.5-32B-Base and post-trained with high-performance supervised fine-tuning on the MindSpore framework and Ascend hardware. After fine-tuning, the model demonstrates significant improvements in mathematical reasoning. PCL-Reasoner-V1 achieves 85.7% on AIME 24 and 84.2% on AIME 25, placing it among the top-tier models in the 32B parameter class on these benchmarks.

We have fully open-sourced the model weights, dataset and training code. Follow the tutorial below to deploy and explore post-training!
## Code
https://github.com/PCL-Reasoner/V1
https://openi.pcl.ac.cn/PCL-Reasoner/V1
## Evaluation
We used the **Avg@32 metric** (averaging 32 sampling attempts per query) for evaluation.
<table>
<tr>
<th>Parameter Size</th>
<th>Model Name</th>
<th>AIME 24</th>
<th>AIME 25</th>
</tr>
<!-- Merged row header: >100B -->
<tr>
<th rowspan="6">>100B</th>
</tr>
<!-- >100B group data rows -->
<tr>
<td>DeepSeek-R1</td>
<td><span style="color:grey">79.8</span></td>
<td><span style="color:grey">70.0</span></td>
</tr>
<tr>
<td>DeepSeek-R1-0528</td>
<td><span style="color:grey">91.4</span></td>
<td><span style="color:grey">87.5</span></td>
</tr>
<tr>
<td>Qwen3-235B-A22B</td>
<td><span style="color:grey">85.7</span></td>
<td><span style="color:grey">81.5</span></td>
</tr>
<tr>
<td>OpenAI-o3</td>
<td><b>91.6</b></td>
<td><b>88.9</b></td>
</tr>
<tr>
<td>Gemini-2.5-Pro-0506</td>
<td><span style="color:grey">90.8</span></td>
<td><span style="color:grey">83.0</span></td>
</tr>
<!-- Merged row header: 32B -->
<tr>
<th rowspan="7">32B</th>
</tr>
<!-- 32B group data rows -->
<tr>
<td>Qwen3-32B</td>
<td><span style="color:grey">81.4</span></td>
<td><span style="color:grey">72.9</span></td>
</tr>
<tr>
<td>QwQ-32B</td>
<td><span style="color:grey">79.5</span></td>
<td><span style="color:grey">69.5</span></td>
</tr>
<tr>
<td>DeepSeek-R1-Distill-Qwen-32B</td>
<td><span style="color:grey">72.6</span></td>
<td><span style="color:grey">49.6</span></td>
</tr>
<tr>
<td>Skywork-OR1-32B</td>
<td><span style="color:grey">82.2</span></td>
<td><span style="color:grey">73.3</span></td>
</tr>
<tr>
<td>AM-Thinking-v1</td>
<td><span style="color:grey">85.3</span></td>
<td><span style="color:grey">74.4</span></td>
</tr>
<tr>
<td>PCL-Reasoner-v1</td>
<td><b>85.7</b></td>
<td><b>84.2</b></td>
</tr>
</table>
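The Avg@32 metric described above can be sketched as follows. This is a minimal illustration, not the project's actual evaluation code: it assumes each query is sampled k times, per-query accuracy is the fraction of correct samples, and the reported score is the mean over queries. The function name `avg_at_k` and the toy data are ours.

```python
from statistics import mean

def avg_at_k(per_query_correct: list[list[bool]]) -> float:
    """Avg@k: mean over queries of the per-query fraction of correct samples.

    per_query_correct[i] holds the correctness of the k samples for query i.
    """
    return mean(mean(samples) for samples in per_query_correct)

# Toy example with 2 queries and k=4 samples each (the evaluation uses k=32):
scores = [
    [True, True, False, True],    # query 1: 3/4 correct -> 0.75
    [False, True, False, False],  # query 2: 1/4 correct -> 0.25
]
print(avg_at_k(scores))  # 0.5
```

Averaging over many samples per query reduces the variance that a single greedy or sampled run would show on a small benchmark like AIME (30 problems per year).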