Collections

Discover the best community collections!

Collections including paper arxiv:2412.19437

about 3 hours ago

Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published 24 days ago • 67
Phi-4 Technical Report

Paper • 2412.08905 • Published Dec 12, 2024 • 106
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 16 days ago • 271
DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 45

Research Papers

DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 45

Learn: LLM Architecture 2025

RoFormer: Enhanced Transformer with Rotary Position Embedding

Paper • 2104.09864 • Published Apr 20, 2021 • 11
DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 45

about 1 month ago

DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 45

Video Creation by Demonstration

Paper • 2412.09551 • Published Dec 12, 2024 • 9
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published Dec 10, 2024 • 45
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation

Paper • 2412.06531 • Published Dec 9, 2024 • 71
APOLLO: SGD-like Memory, AdamW-level Performance

Paper • 2412.05270 • Published Dec 6, 2024 • 38

Phi-4 Technical Report

Paper • 2412.08905 • Published Dec 12, 2024 • 106
Evaluating and Aligning CodeLLMs on Human Preference

Paper • 2412.05210 • Published Dec 6, 2024 • 47
Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published Dec 4, 2024 • 46
Yi-Lightning Technical Report

Paper • 2412.01253 • Published Dec 2, 2024 • 27

Agentic AI systems

Agentless: Demystifying LLM-based Software Engineering Agents

Paper • 2407.01489 • Published Jul 1, 2024 • 59
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems

Paper • 2407.01370 • Published Jul 1, 2024 • 86
OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

Paper • 2412.20005 • Published Dec 28, 2024 • 17
Understanding Alignment in Multimodal LLMs: A Comprehensive Study

Paper • 2407.02477 • Published Jul 2, 2024 • 22

LLM Technical Report

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 344
Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18, 2024 • 141
Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement

Paper • 2409.12122 • Published Sep 18, 2024 • 3
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Paper • 2401.02954 • Published Jan 5, 2024 • 44

Specific Models

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22, 2024 • 256
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study

Paper • 2404.14047 • Published Apr 22, 2024 • 45
Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30, 2024 • 117
DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 45
