# **Phi-4 o1 [ Chain of Thought Reasoning ]**

Phi-4 o1, finetuned from Microsoft's Phi-4, is a state-of-the-art open model built on a blend of synthetic datasets, data from filtered public-domain websites, and acquired academic books and Q&A datasets. This approach aims to ensure that small, capable models are trained with high-quality data focused on advanced reasoning.

Phi-4 adopts a robust safety post-training approach that leverages a variety of open-source and in-house generated synthetic datasets. Safety alignment combines SFT (Supervised Fine-Tuning) with iterative DPO (Direct Preference Optimization), drawing on publicly available datasets focused on helpfulness and harmlessness as well as questions and answers targeting multiple safety categories.
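As a rough illustration of the iterative DPO step mentioned above (a minimal sketch, not the actual training code; the function name and `beta` default are assumptions for the example), the per-pair DPO objective can be written with plain Python:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-preference-pair DPO loss (illustrative sketch).

    Inputs are log-probabilities of the chosen and rejected responses
    under the policy being trained and under a frozen reference model.
    """
    # Scaled difference of the policy's and reference's log-ratios.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log sigmoid(margin): small when the policy prefers the chosen
    # response more strongly than the reference model does.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy agrees exactly with the reference, the margin is zero and the loss is `log 2`; pushing probability toward the chosen response drives the loss toward zero, which is what each DPO iteration optimizes over the preference data.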