- Orca 2: Teaching Small Language Models How to Reason
  Paper • 2311.11045 • Published • 73
- Learning From Mistakes Makes LLM Better Reasoner
  Paper • 2310.20689 • Published • 29
- Let's Verify Step by Step
  Paper • 2305.20050 • Published • 10
- SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning
  Paper • 2308.00436 • Published • 22
Collections including paper arxiv:2401.04925

- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 105
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 42
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 22
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
  Paper • 2402.09727 • Published • 38

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 23
- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 20
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 30
- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 18

- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 64
- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 18
- Lost in the Middle: How Language Models Use Long Contexts
  Paper • 2307.03172 • Published • 40
- Attention Is All You Need
  Paper • 1706.03762 • Published • 55

- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 55
- Simple linear attention language models balance the recall-throughput tradeoff
  Paper • 2402.18668 • Published • 20
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
  Paper • 2402.15220 • Published • 21
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 7

- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 18
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 54
- Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk
  Paper • 2401.05033 • Published • 18
- Towards Conversational Diagnostic AI
  Paper • 2401.05654 • Published • 19

- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 18
- ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
  Paper • 2312.10003 • Published • 40
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
  Paper • 2402.03620 • Published • 115

- Unicron: Economizing Self-Healing LLM Training at Scale
  Paper • 2401.00134 • Published • 11
- Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models
  Paper • 2401.00788 • Published • 22
- Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding
  Paper • 2401.04398 • Published • 24
- The Impact of Reasoning Step Length on Large Language Models
  Paper • 2401.04925 • Published • 18