Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback Paper • 2501.10799 • Published Jan 18 • 15