Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2305.14387

[lecture artifacts] aligning open language models

artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin

LLaMA: Open and Efficient Foundation Language Models

Paper • 2302.13971 • Published Feb 27, 2023 • 14
tatsu-lab/alpaca-7b-wdiff

Text Generation • Updated May 22, 2023 • 210 • 57
lmsys/vicuna-13b-delta-v0

Text Generation • Updated Aug 1, 2023 • 201 • 453
anon8231489123/ShareGPT_Vicuna_unfiltered

Updated Apr 12, 2023 • 19.8k • 770

Papers - Training - Instruction-Following

Alpaca eval: https://github.com/tatsu-lab/alpaca_eval

UltraFeedback: Boosting Language Models with High-quality Feedback

Paper • 2310.01377 • Published Oct 2, 2023 • 5
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback

Paper • 2305.14387 • Published May 22, 2023 • 1

Papers - Fine-tuning - PPO

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 21
UltraFeedback: Boosting Language Models with High-quality Feedback

Paper • 2310.01377 • Published Oct 2, 2023 • 5
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback

Paper • 2305.14387 • Published May 22, 2023 • 1
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 97

Papers - University - Stanford University

BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text

Paper • 2403.18421 • Published Mar 27, 2024 • 23
Long-form factuality in large language models

Paper • 2403.18802 • Published Mar 27, 2024 • 25
stanford-crfm/BioMedLM

Text Generation • Updated Mar 28, 2024 • 5.38k • 416
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 53

Proximal Policy Optimization Algorithms

Paper • 1707.06347 • Published Jul 20, 2017 • 6
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 53
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 146
Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 17

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs