- Attention Is All You Need
  Paper • 1706.03762 • Published • 55
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 17
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
  Paper • 1907.11692 • Published • 7
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
  Paper • 1910.01108 • Published • 14

Collections including paper arxiv:2109.01652

- Instruction Tuning for Large Language Models: A Survey
  Paper • 2308.10792 • Published • 1
- Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
  Paper • 2403.14608 • Published
- Efficient Large Language Models: A Survey
  Paper • 2312.03863 • Published • 3
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 30

- INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models
  Paper • 2306.04757 • Published • 6
- Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation
  Paper • 2308.01240 • Published • 2
- Can Large Language Models Understand Real-World Complex Instructions?
  Paper • 2309.09150 • Published • 2
- Evaluating the Instruction-Following Robustness of Large Language Models to Prompt Injection
  Paper • 2308.10819 • Published

- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 77
- LLaMA: Open and Efficient Foundation Language Models
  Paper • 2302.13971 • Published • 14
- Finetuned Language Models Are Zero-Shot Learners
  Paper • 2109.01652 • Published • 2
- LIMA: Less Is More for Alignment
  Paper • 2305.11206 • Published • 23

- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 64
- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 18
- Lost in the Middle: How Language Models Use Long Contexts
  Paper • 2307.03172 • Published • 40
- Attention Is All You Need
  Paper • 1706.03762 • Published • 55