Collections

Collections including paper arxiv:2405.19332

Mixture-of-Agents Enhances Large Language Model Capabilities
Paper • 2406.04692 • Published Jun 7, 2024 • 58

CRAG -- Comprehensive RAG Benchmark
Paper • 2406.04744 • Published Jun 7, 2024 • 47

Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach
Paper • 2406.04594 • Published Jun 7, 2024 • 8

Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
Paper • 2406.04271 • Published Jun 6, 2024 • 30

ZhangShenao/SELM-Phi-3-mini-4k-instruct-iter-3
Text Generation • Updated Jun 8, 2024 • 38 • 1

ZhangShenao/SELM-Phi-3-mini-4k-instruct-iter-2
Text Generation • Updated Jun 8, 2024 • 31

ZhangShenao/SELM-Phi-3-mini-4k-instruct-iter-1
Text Generation • Updated Jun 8, 2024 • 32

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Paper • 2405.19332 • Published May 29, 2024 • 22

ZhangShenao/SELM-Llama-3-8B-Instruct-iter-3
Text Generation • Updated Jun 8, 2024 • 59 • 5

ZhangShenao/SELM-Llama-3-8B-Instruct-iter-2
Text Generation • Updated Jun 8, 2024 • 19

ZhangShenao/SELM-Llama-3-8B-Instruct-iter-1
Text Generation • Updated Jun 8, 2024 • 14

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Paper • 2405.19332 • Published May 29, 2024 • 22

See our paper at https://huggingface.co/papers/2405.19332.

ZhangShenao/SELM-Zephyr-7B-iter-3
Text Generation • Updated Jun 8, 2024 • 23 • 3

ZhangShenao/SELM-Zephyr-7B-iter-2
Text Generation • Updated Jun 8, 2024 • 22

ZhangShenao/SELM-Zephyr-7B-iter-1
Text Generation • Updated Jun 8, 2024 • 18

ZhangShenao/DPO-Zephyr-7B
Text Generation • Updated Jun 8, 2024 • 12

Understanding the performance gap between online and offline alignment algorithms
Paper • 2405.08448 • Published May 14, 2024 • 19

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Paper • 2405.19332 • Published May 29, 2024 • 22

Offline Regularised Reinforcement Learning for Large Language Models Alignment
Paper • 2405.19107 • Published May 29, 2024 • 14

Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Paper • 2406.00888 • Published Jun 2, 2024 • 33

active learning

AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets
Paper • 2404.05623 • Published Apr 8, 2024 • 3

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Paper • 2405.19332 • Published May 29, 2024 • 22

BPO: Supercharging Online Preference Learning by Adhering to the Proximity of Behavior LLM
Paper • 2406.12168 • Published Jun 18, 2024 • 7

Deep Bayesian Active Learning for Preference Modeling in Large Language Models
Paper • 2406.10023 • Published Jun 14, 2024 • 2
