- Instruction Tuning for Large Language Models: A Survey
  Paper • 2308.10792 • Published • 1
- Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
  Paper • 2403.14608 • Published
- Efficient Large Language Models: A Survey
  Paper • 2312.03863 • Published • 3
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 30
Collections including paper arxiv:2407.06023
- Let's Verify Step by Step
  Paper • 2305.20050 • Published • 10
- LLM Critics Help Catch LLM Bugs
  Paper • 2407.00215 • Published
- Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
  Paper • 2407.21787 • Published • 13
- Generative Verifiers: Reward Modeling as Next-Token Prediction
  Paper • 2408.15240 • Published • 13

- Instruction Tuning with Human Curriculum
  Paper • 2310.09518 • Published • 3
- A Thorough Examination of Decoding Methods in the Era of LLMs
  Paper • 2402.06925 • Published • 1
- Distilling System 2 into System 1
  Paper • 2407.06023 • Published • 3
- Byte Latent Transformer: Patches Scale Better Than Tokens
  Paper • 2412.09871 • Published • 93

- LIMA: Less Is More for Alignment
  Paper • 2305.11206 • Published • 23
- Garment3DGen: 3D Garment Stylization and Texture Generation
  Paper • 2403.18816 • Published • 23
- EgoLifter: Open-world 3D Segmentation for Egocentric Perception
  Paper • 2403.18118 • Published • 12
- The Unreasonable Ineffectiveness of the Deeper Layers
  Paper • 2403.17887 • Published • 79

- InternLM2 Technical Report
  Paper • 2403.17297 • Published • 31
- sDPO: Don't Use Your Data All at Once
  Paper • 2403.19270 • Published • 41
- Learn Your Reference Model for Real Good Alignment
  Paper • 2404.09656 • Published • 84
- OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data
  Paper • 2404.12195 • Published • 12

- DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
  Paper • 2305.10429 • Published • 3
- LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
  Paper • 2403.15042 • Published • 27
- Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
  Paper • 2405.01535 • Published • 121
- Discovering Preference Optimization Algorithms with and for Large Language Models
  Paper • 2406.08414 • Published • 17

- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 39
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 105
- MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
  Paper • 2403.14624 • Published • 52
- Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
  Paper • 2402.12875 • Published • 13