Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face By abidlabs and 4 others • Jul 29 • 167
How Can Input Reformulation Improve Tool Usage Accuracy in a Complex Dynamic Environment? A Study on τ-bench Paper • 2508.20931 • Published 5 days ago • 12
TiKMiX: Take Data Influence into Dynamic Mixture for Language Model Pre-training Paper • 2508.17677 • Published 9 days ago • 13
AWorld: Orchestrating the Training Recipe for Agentic AI Paper • 2508.20404 • Published 6 days ago • 37
AudioStory: Generating Long-Form Narrative Audio with Large Language Models Paper • 2508.20088 • Published 6 days ago • 18
Self-Rewarding Vision-Language Model via Reasoning Decomposition Paper • 2508.19652 • Published 7 days ago • 77
ReportBench: Evaluating Deep Research Agents via Academic Survey Tasks Paper • 2508.15804 • Published 20 days ago • 12
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications Paper • 2508.16279 • Published 12 days ago • 18
AetherCode: Evaluating LLMs' Ability to Win In Premier Programming Competitions Paper • 2508.16402 • Published 12 days ago • 14
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published 8 days ago • 171
Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning Paper • 2508.16949 • Published 11 days ago • 21
Visual-CoG: Stage-Aware Reinforcement Learning with Chain of Guidance for Text-to-Image Generation Paper • 2508.18032 • Published 9 days ago • 40
InMind: Evaluating LLMs in Capturing and Applying Individual Human Reasoning Styles Paper • 2508.16072 • Published 12 days ago • 2
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs Paper • 2508.16153 • Published 12 days ago • 123
LiveMCP-101: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries Paper • 2508.15760 • Published 12 days ago • 43
Mobile-Agent-v3: Foundamental Agents for GUI Automation Paper • 2508.15144 • Published 13 days ago • 55