olmOCR Collection olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 3 items • Updated 5 days ago • 96
SYNTHETIC-1 Collection A collection of tasks & verifiers for reasoning datasets • 9 items • Updated 25 days ago • 49
Reasoning Datasets Collection Reasoning datasets that are trending 🔥 • 10 items • Updated Jan 3 • 24
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27, 2024 • 149
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 7 items • Updated Feb 10 • 76
Self-Discover: Large Language Models Self-Compose Reasoning Structures Paper • 2402.03620 • Published Feb 6, 2024 • 115
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models Paper • 2407.12327 • Published Jul 17, 2024 • 78
Symbolic Learning Enables Self-Evolving Agents Paper • 2406.18532 • Published Jun 26, 2024 • 12
Teaching Transformers Causal Reasoning through Axiomatic Training Paper • 2407.07612 • Published Jul 10, 2024 • 2
AgentInstruct: Toward Generative Teaching with Agentic Flows Paper • 2407.03502 • Published Jul 3, 2024 • 50
Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing Paper • 2407.08770 • Published Jul 11, 2024 • 21
Human-like Episodic Memory for Infinite Context LLMs Paper • 2407.09450 • Published Jul 12, 2024 • 62