Language Models Prefer What They Know: Relative Confidence Estimation via Confidence Preferences Paper • 2502.01126 • Published 4 days ago • 2
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 24 days ago • 53
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 24 days ago • 53
METAGENE-1: Metagenomic Foundation Model for Pandemic Monitoring Paper • 2501.02045 • Published Jan 3 • 21
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 51
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 51
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 51
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 51
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 51
Feedback-Based Self-Learning in Large-Scale Conversational AI Agents Paper • 1911.02557 • Published Nov 6, 2019
A Vocabulary-Free Multilingual Neural Tokenizer for End-to-End Task Learning Paper • 2204.10815 • Published Apr 22, 2022
Self-Aware Feedback-Based Self-Learning in Large-Scale Conversational AI Paper • 2205.00029 • Published Apr 29, 2022
Training-Free Activation Sparsity in Large Language Models Paper • 2408.14690 • Published Aug 26, 2024
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27, 2024 • 39
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27, 2024 • 39
NeuralArTS: Structuring Neural Architecture Search with Type Theory Paper • 2110.08710 • Published Oct 17, 2021
Towards One Shot Search Space Poisoning in Neural Architecture Search Paper • 2111.07138 • Published Nov 13, 2021
Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees Paper • 2110.03313 • Published Oct 7, 2021 • 1