- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models — Paper • 2402.19427 • Published • 55
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection — Paper • 2403.03507 • Published • 186
- BiLLM: Pushing the Limit of Post-Training Quantization for LLMs — Paper • 2402.04291 • Published • 49
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits — Paper • 2402.17764 • Published • 610
cloudhan
AI & ML interests: None yet
Recent Activity
- liked a model 7 days ago: Qwen/QwQ-32B
- liked a model 21 days ago: LatitudeGames/Wayfarer-Large-70B-Llama-3.3-GGUF
- liked a model 21 days ago: google/paligemma2-28b-mix-448
Organizations: None yet
Collections: 1
Models: None public yet
Datasets: None public yet