- Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements
  Paper • 2210.01970 • Published • 11
- Zephyr: Direct Distillation of LM Alignment
  Paper • 2310.16944 • Published • 122
- Datasets: A Community Library for Natural Language Processing
  Paper • 2109.02846 • Published • 12
- HuggingFace's Transformers: State-of-the-art Natural Language Processing
  Paper • 1910.03771 • Published • 16
Collections
Collections including paper arxiv:2310.16944
- Attention Is All You Need
  Paper • 1706.03762 • Published • 49
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 8
- Mixtral of Experts
  Paper • 2401.04088 • Published • 157
- Mistral 7B
  Paper • 2310.06825 • Published • 46

- Detecting Pretraining Data from Large Language Models
  Paper • 2310.16789 • Published • 11
- Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
  Paper • 2310.13671 • Published • 19
- AutoMix: Automatically Mixing Language Models
  Paper • 2310.12963 • Published • 14
- An Emulator for Fine-Tuning Large Language Models using Small Language Models
  Paper • 2310.12962 • Published • 13