Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published 24 days ago • 142
Toward Adaptive Reasoning in Large Language Models with Thought Rollback Paper • 2412.19707 • Published Dec 27, 2024 • 1
Boosting of Thoughts: Trial-and-Error Problem Solving with Large Language Models Paper • 2402.11140 • Published Feb 17, 2024 • 1