NovaSky-AI/Sky-T1-32B-Flash

NovaSkyAI committed 8 days ago

Commit fc9e1a0 · verified · 1 parent: 104fb81

Update README.md

Files changed (1)

README.md +1 -1

@@ -26,7 +26,7 @@ Please see our [blog post](https://novasky-ai.github.io/posts/reduce-overthinkin
 ### Training Data
-12K preference pairs in math and coding domains, generated by Sky-T1-32B-Preview.
+10K preference pairs in math and coding domains, generated by Sky-T1-32B-Preview.
 ### Training Procedure
 We perform Simple Policy Optimization (SimPO) with a batch size of 96, learning rate of 5e-7, gamma of 0.3, and beta of 2.0.
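
For readers unfamiliar with SimPO, here is a minimal sketch of the per-pair loss that the "Training Procedure" line refers to, following the objective in the SimPO paper. It is not the authors' training code. The function name `simpo_loss` and its arguments are hypothetical, and it assumes the README's "gamma of 0.3" is the target reward margin subtracted after scaling by beta; some frameworks parameterize gamma differently.

```python
import math


def simpo_loss(chosen_logp_sum, chosen_len, rejected_logp_sum, rejected_len,
               beta=2.0, gamma=0.3):
    """SimPO loss for a single preference pair (illustrative sketch).

    Assumption: gamma is the target reward margin, as in the SimPO paper.
    Inputs are summed token log-probabilities and response lengths in tokens.
    """
    # SimPO uses the length-normalized (average) log-probability as the reward.
    chosen_avg = chosen_logp_sum / chosen_len
    rejected_avg = rejected_logp_sum / rejected_len

    # Scaled reward difference minus the target margin.
    margin = beta * (chosen_avg - rejected_avg) - gamma

    # -log(sigmoid(margin)), computed in a numerically stable way.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Example: the chosen response has a higher average log-probability.
loss = simpo_loss(chosen_logp_sum=-40.0, chosen_len=50,
                  rejected_logp_sum=-90.0, rejected_len=60)
print(f"{loss:.4f}")
```

The loss shrinks as the chosen response's average log-probability pulls ahead of the rejected one by more than the margin. In training, this value would be averaged over each batch of preference pairs.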