Collections
Collections including paper arxiv:2407.01219
- Instruction Pre-Training: Language Models are Supervised Multitask Learners
  Paper • 2406.14491 • Published • 90
- Better & Faster Large Language Models via Multi-token Prediction
  Paper • 2404.19737 • Published • 77
- RAFT: Adapting Language Model to Domain Specific RAG
  Paper • 2403.10131 • Published • 70
- The Prompt Report: A Systematic Survey of Prompting Techniques
  Paper • 2406.06608 • Published • 60

- The Unreasonable Ineffectiveness of the Deeper Layers
  Paper • 2403.17887 • Published • 79
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 104
- ReFT: Representation Finetuning for Language Models
  Paper • 2404.03592 • Published • 94
- Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
  Paper • 2404.03715 • Published • 61