Collections
Collections including paper arxiv:2410.13166
- VILA^2: VILA Augmented VILA
  Paper • 2407.17453 • Published • 40
- Octopus v4: Graph of language models
  Paper • 2404.19296 • Published • 117
- Octo-planner: On-device Language Model for Planner-Action Agents
  Paper • 2406.18082 • Published • 48
- Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models
  Paper • 2408.15518 • Published • 43

- How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study
  Paper • 2404.14047 • Published • 45
- LiteSearch: Efficacious Tree Search for LLM
  Paper • 2407.00320 • Published • 39
- Cut Your Losses in Large-Vocabulary Language Models
  Paper • 2411.09009 • Published • 47
- LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models
  Paper • 2411.09595 • Published • 73

- TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
  Paper • 2404.11912 • Published • 17
- SnapKV: LLM Knows What You are Looking for Before Generation
  Paper • 2404.14469 • Published • 25
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 259
- An Evolved Universal Transformer Memory
  Paper • 2410.13166 • Published • 3

- Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
  Paper • 2404.07143 • Published • 107
- RULER: What's the Real Context Size of Your Long-Context Language Models?
  Paper • 2404.06654 • Published • 35
- An Evolved Universal Transformer Memory
  Paper • 2410.13166 • Published • 3