- Cosmos World Foundation Model Platform for Physical AI
  Paper • 2501.03575 • Published • 67
- Phi-4 Technical Report
  Paper • 2412.08905 • Published • 106
- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 271
- DeepSeek-V3 Technical Report
  Paper • 2412.19437 • Published • 45
Collections including paper arxiv:2412.19437
- Video Creation by Demonstration
  Paper • 2412.09551 • Published • 9
- DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation
  Paper • 2412.07589 • Published • 45
- Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
  Paper • 2412.06531 • Published • 71
- APOLLO: SGD-like Memory, AdamW-level Performance
  Paper • 2412.05270 • Published • 38

- Phi-4 Technical Report
  Paper • 2412.08905 • Published • 106
- Evaluating and Aligning CodeLLMs on Human Preference
  Paper • 2412.05210 • Published • 47
- Evaluating Language Models as Synthetic Data Generators
  Paper • 2412.03679 • Published • 46
- Yi-Lightning Technical Report
  Paper • 2412.01253 • Published • 27

- Agentless: Demystifying LLM-based Software Engineering Agents
  Paper • 2407.01489 • Published • 59
- Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
  Paper • 2407.01370 • Published • 86
- OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System
  Paper • 2412.20005 • Published • 17
- Understanding Alignment in Multimodal LLMs: A Comprehensive Study
  Paper • 2407.02477 • Published • 22

- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 344
- Qwen2.5-Coder Technical Report
  Paper • 2409.12186 • Published • 141
- Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement
  Paper • 2409.12122 • Published • 3
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 44

- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 256
- How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study
  Paper • 2404.14047 • Published • 45
- Octopus v4: Graph of language models
  Paper • 2404.19296 • Published • 117
- DeepSeek-V3 Technical Report
  Paper • 2412.19437 • Published • 45