ds4sd/SmolDocling-256M-preview Image-Text-to-Text • 0.3B • Updated 10 days ago • 41.4k • 1.56k
olmOCR Collection olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 6 items • Updated Jul 23 • 129
Article Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H By Hcompany and 1 other • Jun 3 • 71
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B Text Generation • 33B • Updated Feb 24 • 1.87M • 1.44k
Running • 3.15k • The Ultra-Scale Playbook: The ultimate guide to training LLM on large GPU Clusters
Article Visual Document Retrieval Goes Multilingual By marco and 1 other • Jan 10 • 75
Article Preference Optimization for Vision Language Models By qgallouedec and 3 others • Jul 10, 2024 • 80
Running on Zero • 1.99k • Chat With Janus-Pro-7B: A unified multimodal understanding and generation model.