view article Article Open-source DeepResearch – Freeing our search agents By m-ric and 4 others • Feb 4 • 1.29k
A Survey of Context Engineering for Large Language Models Paper • 2507.13334 • Published Jul 17 • 248
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based Qwen2.5 • 7 items • Updated Jul 21 • 155
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 134
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published Mar 24 • 121
view article Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM By ariG23498 and 3 others • Mar 12 • 457
view article Article SigLIP 2: A better multilingual vision language encoder By ariG23498 and 2 others • Feb 21 • 179
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 535
view article Article SmolVLM Grows Smaller – Introducing the 250M & 500M Models! By andito and 2 others • Jan 23 • 182
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 242
view article Article Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 878
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 298
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 284
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6, 2024 • 64
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 150