Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2407.06023

LLM Post Training

Instruction Tuning for Large Language Models: A Survey

Paper • 2308.10792 • Published Aug 21, 2023 • 1
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey

Paper • 2403.14608 • Published Mar 21, 2024
Efficient Large Language Models: A Survey

Paper • 2312.03863 • Published Dec 6, 2023 • 3
ReFT: Reasoning with Reinforced Fine-Tuning

Paper • 2401.08967 • Published Jan 17, 2024 • 30

LLM Reasoning Papers

improve reasoning capabilities of LLMs

Let's Verify Step by Step

Paper • 2305.20050 • Published May 31, 2023 • 10
LLM Critics Help Catch LLM Bugs

Paper • 2407.00215 • Published Jun 28, 2024
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

Paper • 2407.21787 • Published Jul 31, 2024 • 13
Generative Verifiers: Reward Modeling as Next-Token Prediction

Paper • 2408.15240 • Published Aug 27, 2024 • 13

Papers - Training - Text - Continual Learning

Distilling System 2 into System 1

Paper • 2407.06023 • Published Jul 8, 2024 • 3

Papers - CoT - Branch Solve Merge (BSM)

Distilling System 2 into System 1

Paper • 2407.06023 • Published Jul 8, 2024 • 3
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation

Paper • 2310.15123 • Published Oct 23, 2023 • 8

Papers - CoT - Intermediate Thoughts

Distilling System 2 into System 1

Paper • 2407.06023 • Published Jul 8, 2024 • 3

Papers - Llama 2

Instruction Tuning with Human Curriculum

Paper • 2310.09518 • Published Oct 14, 2023 • 3
A Thorough Examination of Decoding Methods in the Era of LLMs

Paper • 2402.06925 • Published Feb 10, 2024 • 1
Distilling System 2 into System 1

Paper • 2407.06023 • Published Jul 8, 2024 • 3
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 93

LIMA: Less Is More for Alignment

Paper • 2305.11206 • Published May 18, 2023 • 23
Garment3DGen: 3D Garment Stylization and Texture Generation

Paper • 2403.18816 • Published Mar 27, 2024 • 23
EgoLifter: Open-world 3D Segmentation for Egocentric Perception

Paper • 2403.18118 • Published Mar 26, 2024 • 12
The Unreasonable Ineffectiveness of the Deeper Layers

Paper • 2403.17887 • Published Mar 26, 2024 • 79

Papers - Fine-tuning - SFT

InternLM2 Technical Report

Paper • 2403.17297 • Published Mar 26, 2024 • 31
sDPO: Don't Use Your Data All at Once

Paper • 2403.19270 • Published Mar 28, 2024 • 41
Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15, 2024 • 84
OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data

Paper • 2404.12195 • Published Apr 18, 2024 • 12

Papers - Training - AI training AI

DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining

Paper • 2305.10429 • Published May 17, 2023 • 3
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

Paper • 2403.15042 • Published Mar 22, 2024 • 27
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

Paper • 2405.01535 • Published May 2, 2024 • 121
Discovering Preference Optimization Algorithms with and for Large Language Models

Paper • 2406.08414 • Published Jun 12, 2024 • 17

Papers - CoT - Chain of Thought

Contrastive Decoding Improves Reasoning in Large Language Models

Paper • 2309.09117 • Published Sep 17, 2023 • 39
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 105
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Paper • 2403.14624 • Published Mar 21, 2024 • 52
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

Paper • 2402.12875 • Published Feb 20, 2024 • 13

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs