Collections including paper arxiv:2405.12250
Collection 1
- Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
  Paper • 2405.08748 • Published • 24
- Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
  Paper • 2405.10300 • Published • 29
- Chameleon: Mixed-Modal Early-Fusion Foundation Models
  Paper • 2405.09818 • Published • 131
- OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
  Paper • 2405.11143 • Published • 38
Collection 2
- Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
  Paper • 2404.18796 • Published • 70
- KAN: Kolmogorov-Arnold Networks
  Paper • 2404.19756 • Published • 111
- The Road Less Scheduled
  Paper • 2405.15682 • Published • 27
- Your Transformer is Secretly Linear
  Paper • 2405.12250 • Published • 157
Collection 3
- Rho-1: Not All Tokens Are What You Need
  Paper • 2404.07965 • Published • 92
- VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
  Paper • 2404.10667 • Published • 19
- Instruction-tuned Language Models are Better Knowledge Learners
  Paper • 2402.12847 • Published • 26
- DoRA: Weight-Decomposed Low-Rank Adaptation
  Paper • 2402.09353 • Published • 28
Collection 4
- Rho-1: Not All Tokens Are What You Need
  Paper • 2404.07965 • Published • 92
- LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
  Paper • 2404.05961 • Published • 65
- Compression Represents Intelligence Linearly
  Paper • 2404.09937 • Published • 27
- Multi-Head Mixture-of-Experts
  Paper • 2404.15045 • Published • 60
Collection 5
- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 127
- Evolutionary Optimization of Model Merging Recipes
  Paper • 2403.13187 • Published • 53
- MobileVLM V2: Faster and Stronger Baseline for Vision Language Model
  Paper • 2402.03766 • Published • 14
- LLM Agent Operating System
  Paper • 2403.16971 • Published • 68
Collection 6
- Linear Transformers with Learnable Kernel Functions are Better In-Context Models
  Paper • 2402.10644 • Published • 81
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
  Paper • 2401.04658 • Published • 27
- KAN: Kolmogorov-Arnold Networks
  Paper • 2404.19756 • Published • 111
- Your Transformer is Secretly Linear
  Paper • 2405.12250 • Published • 157
Collection 7
- Hydragen: High-Throughput LLM Inference with Shared Prefixes
  Paper • 2402.05099 • Published • 20
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting
  Paper • 2402.13720 • Published • 7
- Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
  Paper • 2405.12981 • Published • 32
- Your Transformer is Secretly Linear
  Paper • 2405.12250 • Published • 157
Collection 8
- SMOTE: Synthetic Minority Over-sampling Technique
  Paper • 1106.1813 • Published • 1
- Scikit-learn: Machine Learning in Python
  Paper • 1201.0490 • Published • 1
- Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
  Paper • 1406.1078 • Published
- Distributed Representations of Sentences and Documents
  Paper • 1405.4053 • Published
Collection 9
- PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models
  Paper • 2402.08714 • Published • 14
- Data Engineering for Scaling Language Models to 128K Context
  Paper • 2402.10171 • Published • 25
- RLVF: Learning from Verbal Feedback without Overgeneralization
  Paper • 2402.10893 • Published • 12
- Coercing LLMs to do and reveal (almost) anything
  Paper • 2402.14020 • Published • 13