StreamAdapter: Efficient Test Time Adaptation from Contextual Streams • Paper • 2411.09289 • Published Nov 14, 2024
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs • Paper • 2503.01743 • Published 10 days ago • 72
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models • Paper • 2501.13629 • Published Jan 23, 2025 • 44
An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models • Paper • 2309.09958 • Published Sep 18, 2023 • 19
AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation • Paper • 2305.09515 • Published May 16, 2023 • 3