# PCL-Reasoner-V1

## Model Overview  
We release **PCL-Reasoner-V1**, a model built on **Qwen2.5-32B-Base** and post-trained with high-performance supervised fine-tuning on the **MindSpore framework** and **Ascend hardware**. After fine-tuning, the model demonstrates significant gains in mathematical reasoning, scoring 85.7% on AIME 24 and 84.2% on AIME 25, which places PCL-Reasoner-V1 among the top-tier models in the 32B parameter class on these benchmarks.

![eval_results](images/README/eval_results.png)  

We have fully open-sourced the model weights, dataset, and training code. Follow the tutorial below to deploy the model and explore post-training!

## Code
https://github.com/PCL-Reasoner/V1

https://openi.pcl.ac.cn/PCL-Reasoner/V1

## Evaluation
We evaluate with the **Avg@32** metric: accuracy averaged over 32 sampled responses per query.
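As a concrete illustration, Avg@32 can be computed by grading each of the 32 samples per query and averaging. The sketch below is a minimal, hypothetical implementation (the function name `avg_at_k` and the toy data are ours, not from the release):

```python
def avg_at_k(samples_correct: list[list[bool]]) -> float:
    """Avg@k: mean per-query accuracy, reported as a percentage.

    samples_correct[i] holds the correctness of each of the k sampled
    responses for query i (k = 32 in the evaluation above).
    """
    per_query = [sum(s) / len(s) for s in samples_correct]
    return 100.0 * sum(per_query) / len(per_query)

# Toy example: 2 queries, 4 samples each (k would be 32 in practice).
score = avg_at_k([[True, True, False, True], [False, True, False, False]])
print(round(score, 1))  # 50.0
```

Averaging over many samples reduces the variance that a single greedy or sampled run would show on a 30-question benchmark like AIME.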


<table>
  <tr>
    <th>Parameter Size</th>
    <th>Model Name</th>
    <th>AIME 24</th>
    <th>AIME 25</th>
  </tr>
  <!-- Merged row header: >100B -->
  <tr>
    <th rowspan="6">&gt;100B</th>
  </tr>
  <!-- >100B group data rows -->
  <tr>
    <td>DeepSeek-R1</td>
    <td><span style="color:grey">79.8</span></td>
    <td><span style="color:grey">70.0</span></td>
  </tr>
  <tr>
    <td>DeepSeek-R1-0528</td>
    <td><span style="color:grey">91.4</span></td>
    <td><span style="color:grey">87.5</span></td>
  </tr>
  <tr>
    <td>Qwen3-235B-A22B</td>
    <td><span style="color:grey">85.7</span></td>
    <td><span style="color:grey">81.5</span></td>
  </tr>
  <tr>
    <td>OpenAI-o3</td>
    <td><b>91.6</b></td>
    <td><b>88.9</b></td>
  </tr>
  <tr>
    <td>Gemini-2.5-Pro-0506</td>
    <td><span style="color:grey">90.8</span></td>
    <td><span style="color:grey">83.0</span></td>
  </tr>
  <!-- Merged row header: 32B -->
  <tr>
    <th rowspan="7">32B</th>
  </tr>
  <!-- 32B group data rows -->
  <tr>
    <td>Qwen3-32B</td>
    <td><span style="color:grey">81.4</span></td>
    <td><span style="color:grey">72.9</span></td>
  </tr>
  <tr>
    <td>QwQ-32B</td>
    <td><span style="color:grey">79.5</span></td> 
    <td><span style="color:grey">69.5</span></td>
  </tr>
  <tr>
    <td>DeepSeek-R1-Distill-Qwen-32B</td>
    <td><span style="color:grey">72.6</span></td>
    <td><span style="color:grey">49.6</span></td> 
  </tr>
  <tr>
    <td>Skywork-OR1-32B</td>
    <td><span style="color:grey">82.2</span></td>
    <td><span style="color:grey">73.3</span></td>
  </tr>
  <tr>
    <td>AM-Thinking-v1</td>
    <td><span style="color:grey">85.3</span></td>
    <td><span style="color:grey">74.4</span></td>
  </tr>
  <tr>
    <td>PCL-Reasoner-V1</td>
    <td><b>85.7</b></td>
    <td><b>84.2</b></td>
  </tr>
</table>