- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 106
- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
  Paper • 2404.12253 • Published • 55
- Make Your LLM Fully Utilize the Context
  Paper • 2404.16811 • Published • 54
- ReFT: Representation Finetuning for Language Models
  Paper • 2404.03592 • Published • 96
Collections including paper arxiv:2402.10200
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 106
- Large Language Models Cannot Self-Correct Reasoning Yet
  Paper • 2310.01798 • Published • 36
- Premise Order Matters in Reasoning with Large Language Models
  Paper • 2402.08939 • Published • 28
- Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
  Paper • 2402.12875 • Published • 13

- Let's Verify Step by Step
  Paper • 2305.20050 • Published • 10
- LLM Critics Help Catch LLM Bugs
  Paper • 2407.00215 • Published
- Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
  Paper • 2407.21787 • Published • 13
- Generative Verifiers: Reward Modeling as Next-Token Prediction
  Paper • 2408.15240 • Published • 13

- Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents
  Paper • 2408.07199 • Published • 21
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
  Paper • 2403.09629 • Published • 77
- Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning
  Paper • 2406.12050 • Published • 19
- Let's Verify Step by Step
  Paper • 2305.20050 • Published • 10

- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 106
- PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking
  Paper • 2410.12375 • Published • 4
- NousResearch/DeepHermes-3-Llama-3-8B-Preview-GGUF
  Updated • 9.88k • 71

- Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs
  Paper • 2407.00653 • Published • 12
- Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
  Paper • 2406.18629 • Published • 42
- Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities
  Paper • 2406.14562 • Published • 28
- Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
  Paper • 2406.04271 • Published • 30