FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU Paper • 2303.06865 • Published Mar 13, 2023 • 1
Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning Paper • 2306.00088 • Published May 31, 2023 • 1
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads Paper • 2410.01805 • Published Oct 2, 2024
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild Paper • 2410.05357 • Published Oct 7, 2024
Zero-Indexing Internet Search Augmented Generation for Large Language Models Paper • 2411.19478 • Published Nov 29, 2024
HEXGEN-TEXT2SQL: Optimizing LLM Inference Request Scheduling for Agentic Text-to-SQL Workflow Paper • 2505.05286 • Published May 8, 2025 • 1
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning Paper • 2505.24298 • Published May 30, 2025 • 28
Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning Paper • 2506.07227 • Published Jun 8, 2025
Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification Paper • 2506.07235 • Published Jun 8, 2025 • 3
Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny Paper • 2507.16331 • Published Jul 22, 2025 • 20
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models Paper • 2504.10449 • Published Apr 14, 2025 • 15
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published Jan 14, 2025 • 62
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 56
Feedback-Based Self-Learning in Large-Scale Conversational AI Agents Paper • 1911.02557 • Published Nov 6, 2019
A Vocabulary-Free Multilingual Neural Tokenizer for End-to-End Task Learning Paper • 2204.10815 • Published Apr 22, 2022
Self-Aware Feedback-Based Self-Learning in Large-Scale Conversational AI Paper • 2205.00029 • Published Apr 29, 2022
Training-Free Activation Sparsity in Large Language Models Paper • 2408.14690 • Published Aug 26, 2024
Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining Paper • 2410.08102 • Published Oct 10, 2024 • 21
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27, 2024 • 42