LLM Modules: Knowledge Transfer from a Large to a Small Model using Enhanced Cross-Attention • Paper • 2502.08213 • Published 29 days ago