ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference Paper • 2410.21465 • Published Oct 28, 2024
Fast Best-of-N Decoding via Speculative Rejection Paper • 2410.20290 • Published Oct 26, 2024