Efficient Attention Mechanisms for Large Language Models: A Survey • Paper • 2507.19595 • Published Jul 25 • 6
FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling • Paper • 2502.14856 • Published Feb 20 • 8