Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2406.11931

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17, 2024 • 62
MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs

Paper • 2406.11833 • Published Jun 17, 2024 • 62

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17, 2024 • 62

Code Generation

CodeEditorBench: Evaluating Code Editing Capability of Large Language Models

Paper • 2404.03543 • Published Apr 4, 2024 • 16
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17, 2024 • 62
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents

Paper • 2407.18901 • Published Jul 26, 2024 • 33
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents

Paper • 2408.07060 • Published Aug 13, 2024 • 42

Foundation Models

OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 83
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Paper • 2403.05530 • Published Mar 8, 2024 • 62
StarCoder: may the source be with you!

Paper • 2305.06161 • Published May 9, 2023 • 29
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling

Paper • 2312.15166 • Published Dec 23, 2023 • 57

LLM+Architecture

MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation

Paper • 2404.11565 • Published Apr 17, 2024 • 15
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

Paper • 2406.06563 • Published Jun 3, 2024 • 18
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17, 2024 • 62

Nemotron-4 15B Technical Report

Paper • 2402.16819 • Published Feb 26, 2024 • 43
InternLM2 Technical Report

Paper • 2403.17297 • Published Mar 26, 2024 • 31
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model

Paper • 2404.04167 • Published Apr 5, 2024 • 13
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Paper • 2402.14905 • Published Feb 22, 2024 • 126

Code Generation

Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation

Paper • 2310.18628 • Published Oct 28, 2023 • 8
ChatCoder: Chat-based Refine Requirement Improves LLMs' Code Generation

Paper • 2311.00272 • Published Nov 1, 2023 • 10
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

Paper • 2312.04474 • Published Dec 7, 2023 • 31
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

Paper • 2402.14658 • Published Feb 22, 2024 • 82

CodeFusion: A Pre-trained Diffusion Model for Code Generation

Paper • 2310.17680 • Published Oct 26, 2023 • 69
MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning

Paper • 2311.02303 • Published Nov 4, 2023 • 5
A Survey on Language Models for Code

Paper • 2311.07989 • Published Nov 14, 2023 • 22
Magicoder: Source Code Is All You Need

Paper • 2312.02120 • Published Dec 4, 2023 • 80

Contrastive Decoding Improves Reasoning in Large Language Models

Paper • 2309.09117 • Published Sep 17, 2023 • 38
RMT: Retentive Networks Meet Vision Transformers

Paper • 2309.11523 • Published Sep 20, 2023 • 33
Guiding a Diffusion Model with a Bad Version of Itself

Paper • 2406.02507 • Published Jun 4, 2024 • 16
Jina CLIP: Your CLIP Model Is Also Your Text Retriever

Paper • 2405.20204 • Published May 30, 2024 • 35

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset

Paper • 2309.04662 • Published Sep 9, 2023 • 23
Neurons in Large Language Models: Dead, N-gram, Positional

Paper • 2309.04827 • Published Sep 9, 2023 • 16
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Paper • 2309.05516 • Published Sep 11, 2023 • 9
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs

Paper • 2309.03907 • Published May 18, 2023 • 11

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs