NovaSkyAI committed on
Commit
fc9e1a0
·
verified ·
1 Parent(s): 104fb81

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -26,7 +26,7 @@ Please see our [blog post](https://novasky-ai.github.io/posts/reduce-overthinkin
 
 ### Training Data
 
-12K preference pairs in math and coding domains, generated by Sky-T1-32B-Preview.
+10K preference pairs in math and coding domains, generated by Sky-T1-32B-Preview.
 
 ### Training Procedure
 We perform Simple Policy Optimization (SimPO) with a batch size of 96, learning rate of 5e-7, gamma of 0.3, and beta of 2.0.
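
For readers unfamiliar with how `beta` and `gamma` enter the objective, here is a minimal sketch of the per-pair SimPO loss as it is usually written: a length-normalized log-probability reward scaled by `beta`, with `gamma` as a target margin. The function name `simpo_loss` and its arguments are hypothetical, and reading the README's `gamma` as the raw margin (not a ratio to `beta`) is an assumption. This is not the training code used for this model.

```python
import math


def simpo_loss(chosen_logp_sum, rejected_logp_sum, chosen_len, rejected_len,
               beta=2.0, gamma=0.3):
    """Per-pair SimPO loss (illustrative sketch, not the authors' code).

    chosen_logp_sum / rejected_logp_sum: summed token log-probabilities of
    the preferred and dispreferred responses under the policy.
    chosen_len / rejected_len: response lengths in tokens.
    beta, gamma: defaults taken from the README; treating gamma as the raw
    target margin is an assumption.
    """
    # Length-normalized implicit rewards
    chosen_reward = beta * chosen_logp_sum / chosen_len
    rejected_reward = beta * rejected_logp_sum / rejected_len
    margin = chosen_reward - rejected_reward - gamma
    # -log(sigmoid(margin)), computed in a numerically stable way
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss shrinks as the average log-probability of the preferred response rises relative to the dispreferred one, and a pair only counts as well separated once the reward gap exceeds `gamma`.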