Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments Paper • 2501.10893 • Published 12 days ago • 22
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks Paper • 2501.11733 • Published 10 days ago • 26
UI-TARS: Pioneering Automated GUI Interaction with Native Agents Paper • 2501.12326 • Published 10 days ago • 47
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 29 days ago • 99
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published 24 days ago • 48
Table Transformer Collection The Table Transformer (TATR) is a series of object detection models useful for table extraction from PDF images. • 5 items • Updated 23 days ago • 21