Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding Paper • 2502.05609 • Published Feb 8 • 18 • 3