Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction • Paper 2505.11254 • Published May 16, 2025
Model Merging in Pre-training of Large Language Models • Paper 2505.12082 • Published May 17, 2025