Beyond Quantity: Trajectory Diversity Scaling for Code Agents Paper • 2602.03219 • Published Feb 3 • 2
FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration Paper • 2603.29557 • Published 3 days ago • 14
IPBench: Benchmarking the Knowledge of Large Language Models in Intellectual Property Paper • 2504.15524 • Published Apr 22, 2025 • 3
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published Feb 20, 2025 • 110
AutoPatent: A Multi-Agent Framework for Automatic Patent Generation Paper • 2412.09796 • Published Dec 13, 2024 • 2
IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models Paper • 2406.12386 • Published Jun 18, 2024 • 1