Collections
Collections including paper arxiv:2402.00838

- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 19
- Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation
  Paper • 2401.15688 • Published • 11
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 72
- From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities
  Paper • 2401.15071 • Published • 37

- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 83
- Efficient Exploration for LLMs
  Paper • 2402.00396 • Published • 22
- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 23
- Transforming and Combining Rewards for Aligning Large Language Models
  Paper • 2402.00742 • Published • 12

- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 83
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
  Paper • 2402.01739 • Published • 27
- LLM Agent Operating System
  Paper • 2403.16971 • Published • 65
- Poro 34B and the Blessing of Multilinguality
  Paper • 2404.01856 • Published • 14

- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 53
- Simple linear attention language models balance the recall-throughput tradeoff
  Paper • 2402.18668 • Published • 19
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
  Paper • 2402.15220 • Published • 19
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 6

- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 146
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 52
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 83
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
  Paper • 2402.01739 • Published • 27

- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 146
- Orion-14B: Open-source Multilingual Large Language Models
  Paper • 2401.12246 • Published • 13
- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 54
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 47

- Attention Is All You Need
  Paper • 1706.03762 • Published • 50
- You Only Look Once: Unified, Real-Time Object Detection
  Paper • 1506.02640 • Published • 1
- HEp-2 Cell Image Classification with Deep Convolutional Neural Networks
  Paper • 1504.02531 • Published
- Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
  Paper • 2401.05566 • Published • 28

- TinyLlama: An Open-Source Small Language Model
  Paper • 2401.02385 • Published • 92
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 47
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 72
- Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
  Paper • 2401.16380 • Published • 49

- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 43
- Qwen Technical Report
  Paper • 2309.16609 • Published • 35
- GPT-4 Technical Report
  Paper • 2303.08774 • Published • 5
- Gemini: A Family of Highly Capable Multimodal Models
  Paper • 2312.11805 • Published • 44