Rethinking the Value of Agent-Generated Tests for LLM-Based Software Engineering Agents • Paper • 2602.07900 • Published 5 days ago • 4
DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents • Paper • 2602.07035 • Published 10 days ago • 29
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding • Paper • 2602.01785 • Published 11 days ago • 92
PaperBanana: Automating Academic Illustration for AI Scientists • Paper • 2601.23265 • Published 14 days ago • 176
SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents • Paper • 2601.16746 • Published 21 days ago • 89
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning • Paper • 2601.06943 • Published Jan 11 • 211
GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts • Paper • 2601.05110 • Published Jan 8 • 29
Fed-SE: Federated Self-Evolution for Privacy-Constrained Multi-Environment LLM Agents • Paper • 2512.08870 • Published Dec 9, 2025 • 4
HyperAgent: Leveraging Hypergraphs for Topology Optimization in Multi-Agent Communication • Paper • 2510.10611 • Published Oct 12, 2025 • 5
GraphTracer: Graph-Guided Failure Tracing in LLM Agents for Robust Multi-Turn Deep Search • Paper • 2510.10581 • Published Oct 12, 2025 • 3
LongCodeZip: Compress Long Context for Code Language Models • Paper • 2510.00446 • Published Oct 1, 2025 • 107
Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models • Paper • 2509.26628 • Published Sep 30, 2025 • 17
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? • Paper • 2509.16941 • Published Sep 21, 2025 • 21
SWE-QA: Can Language Models Answer Repository-level Code Questions? • Paper • 2509.14635 • Published Sep 18, 2025 • 35
Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal • Paper • 2508.05988 • Published Aug 8, 2025 • 21
EVOC2RUST: A Skeleton-guided Framework for Project-Level C-to-Rust Translation • Paper • 2508.04295 • Published Aug 6, 2025 • 7
SWE-Exp: Experience-Driven Software Issue Resolution • Paper • 2507.23361 • Published Jul 31, 2025 • 14
SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution • Paper • 2507.23348 • Published Jul 31, 2025 • 12