yuchenFan committed
Commit fe98fad · verified · 1 Parent(s): 7f2b61d

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -130,7 +130,9 @@ We use codes in [Implicit PRM](https://github.com/PRIME-RL/ImplicitPRM/tree/main
 
 ### Evaluation Base Model
 
-We adopt **Eurus-2-7B-SFT**, **Qwen2.5-7B-Instruct** and **Llama-3.1-70B-Instruct** as generation models to evaluate the performance of our implicit PRM. For all models, we set the sampling temperature as 0.5, *p* of the top-*p* sampling as 1.
+For **Best-of N Sampling**, we adopt **Eurus-2-7B-SFT**, **Qwen2.5-7B-Instruct** and **Llama-3.1-70B-Instruct** as generation models to evaluate the performance of our implicit PRM. For all models, we set the sampling temperature as 0.5, *p* of the top-*p* sampling as 1.
+
+For **ProcessBench**, we adopt **Math-Shepherd-PRM-7B**, **RLHFlow-PRM-Mistral-8B**, **RLHFlow-PRM-Deepseek-8B**, **Skywork-PRM-7B**, **EurusPRM-Stage 1**, and **EurusPRM-Stage 2**.
 
 ### Best-of-N Sampling
 
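
For readers of this diff, the sampling settings quoted in the changed paragraph and the general idea of Best-of-N selection can be sketched roughly as follows. This is only an illustration, not the repository's evaluation code: `SAMPLING_CONFIG`, `best_of_n`, and `toy_score` are hypothetical names, and `toy_score` is a stand-in for a real reward model. A top-*p* value of 1 means the nucleus filter keeps the full token distribution.

```python
from typing import Callable, List

# Sampling settings quoted in the README paragraph (the key names are an assumption).
SAMPLING_CONFIG = {"temperature": 0.5, "top_p": 1.0}


def best_of_n(candidates: List[str], score: Callable[[str], float]) -> str:
    """Return the candidate with the highest score; on ties the first one wins."""
    if not candidates:
        raise ValueError("need at least one candidate response")
    return max(candidates, key=score)


def toy_score(response: str) -> float:
    """Hypothetical stand-in for a reward model: longer responses score higher."""
    return float(len(response))


# N = 3 candidate responses, as if sampled from a generation model.
candidates = ["x = 2", "x = 2 because 2 + 2 = 4", "x = 3"]
best = best_of_n(candidates, toy_score)
```

In an actual evaluation the candidates would come from one of the generation models listed above, and the scoring function would be a PRM rather than a length heuristic.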