Collections
Discover the best community collections!
Collections including paper arxiv:2406.11931
-
CodeEditorBench: Evaluating Code Editing Capability of Large Language Models
Paper • 2404.03543 • Published • 16 -
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
Paper • 2406.11931 • Published • 62 -
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
Paper • 2407.18901 • Published • 33 -
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
Paper • 2408.07060 • Published • 42
-
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 83 -
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Paper • 2403.05530 • Published • 62 -
StarCoder: may the source be with you!
Paper • 2305.06161 • Published • 29 -
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
Paper • 2312.15166 • Published • 57
-
MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation
Paper • 2404.11565 • Published • 15 -
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models
Paper • 2406.06563 • Published • 18 -
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
Paper • 2406.11931 • Published • 62
-
Nemotron-4 15B Technical Report
Paper • 2402.16819 • Published • 43 -
InternLM2 Technical Report
Paper • 2403.17297 • Published • 31 -
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Paper • 2404.04167 • Published • 13 -
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Paper • 2402.14905 • Published • 126
-
Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation
Paper • 2310.18628 • Published • 8 -
ChatCoder: Chat-based Refine Requirement Improves LLMs' Code Generation
Paper • 2311.00272 • Published • 10 -
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
Paper • 2312.04474 • Published • 31 -
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Paper • 2402.14658 • Published • 82
-
CodeFusion: A Pre-trained Diffusion Model for Code Generation
Paper • 2310.17680 • Published • 69 -
MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning
Paper • 2311.02303 • Published • 5 -
A Survey on Language Models for Code
Paper • 2311.07989 • Published • 22 -
Magicoder: Source Code Is All You Need
Paper • 2312.02120 • Published • 80
-
Contrastive Decoding Improves Reasoning in Large Language Models
Paper • 2309.09117 • Published • 38 -
RMT: Retentive Networks Meet Vision Transformers
Paper • 2309.11523 • Published • 33 -
Guiding a Diffusion Model with a Bad Version of Itself
Paper • 2406.02507 • Published • 16 -
Jina CLIP: Your CLIP Model Is Also Your Text Retriever
Paper • 2405.20204 • Published • 35
-
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Paper • 2309.04662 • Published • 23 -
Neurons in Large Language Models: Dead, N-gram, Positional
Paper • 2309.04827 • Published • 16 -
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Paper • 2309.05516 • Published • 9 -
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Paper • 2309.03907 • Published • 11