Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives Paper • 2501.04003 • Published 24 days ago • 25
OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints Paper • 2501.03841 • Published 24 days ago • 53
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 17 days ago • 271
Hallucinations Can Improve Large Language Models in Drug Discovery Paper • 2501.13824 • Published 8 days ago • 8
Return of the Encoder: Maximizing Parameter Efficiency for SLMs Paper • 2501.16273 • Published 4 days ago • 4
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 3 days ago • 48
Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling Paper • 2501.16975 • Published 3 days ago • 16
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89
Fineweb-Edu-Ar Collection Largest (as of 2024) machine-translated Arabic educational corpus • 2 items • Updated Dec 16, 2024 • 1
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs Paper • 2412.08347 • Published Dec 11, 2024 • 4
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards Paper • 2402.01781 • Published Feb 1, 2024 • 2
Fineweb-Edu-Ar: Machine-translated Corpus to Support Arabic Small Language Models Paper • 2411.06402 • Published Nov 10, 2024 • 2
SmolTulu Collection A collection of models that use SmolLM2 as the pretrained base in conjunction with AllenAI's Tulu 3 post-training pipeline. • 6 items • Updated Dec 17, 2024 • 1
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state-of-the-art open post-training recipes. • 33 items • Updated 2 days ago • 64