Submitted by YerbaPage 74 LongCodeZip: Compress Long Context for Code Language Models Stanford University 61 5
Submitted by cuijiaxing 66 Self-Forcing++: Towards Minute-Scale High-Quality Video Generation ByteDance Seed 37 3
Submitted by yulunliu 52 StealthAttack: Robust 3D Gaussian Splatting Poisoning via Density-Guided Illusions National Yang Ming Chiao Tung University 44 2
Submitted by yuntian-deng 36 Interactive Training: Feedback-Driven Neural Network Optimization Yuntian Group 8 3
Submitted by taesiri 27 StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets? · 7 authors 3
Submitted by ruohao 19 Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks · 6 authors 2
Submitted by invokerliang 19 CLUE: Non-parametric Verification from Experience via Hidden-State Clustering Tencent 1
Submitted by lr10260 19 VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning Tencent 2
Submitted by xw-eric 18 The Unreasonable Effectiveness of Scaling Agents for Computer Use Simular 6.44k 2
Submitted by weiminwang 16 Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation Character.AI 121 2
Submitted by songw-zju 15 RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning Westlake University 21 2
Submitted by Geralt-Targaryen 13 F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data CodeFuse AI 26 2
Submitted by taesiri 13 A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports · 12 authors 2
Submitted by zhangchenxu 11 TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments IBM 41 3
Submitted by Shilin-LU 10 DragFlow: Unleashing DiT Priors with Region Based Supervision for Drag Editing · 7 authors 2
Submitted by AdamF92 8 Sparse Query Attention (SQA): A Computationally Efficient Attention Mechanism with Query Heads Reduction Reactive AI 0 2
Submitted by Harold328 7 Go with Your Gut: Scaling Confidence for Autoregressive Image Generation · 7 authors 11 2
Submitted by yxl66666 7 Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow · 11 authors 3 1
Submitted by YuZeng260 6 Agentic Jigsaw Interaction Learning for Enhancing Visual Perception and Reasoning in Vision-Language Models · 12 authors 10 3
Submitted by erjui 6 Automated Structured Radiology Report Generation with Rich Clinical Context · 6 authors 2 3
Submitted by zorik 6 Fine-Grained Detection of Context-Grounded Hallucinations Using LLMs Technion Israel Institute of Technology 2
Submitted by enisimsar 5 Optimal Control Meets Flow Matching: A Principled Route to Multi-Subject Fidelity · 3 authors 4 2
Submitted by SteveZeyuZhang 5 VLA-R1: Enhancing Reasoning in Vision-Language-Action Models · 6 authors 10 2
Submitted by Ksgk-fy 4 RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems · 7 authors 2
Submitted by yanxi-chen 4 Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRPO and Its Friends · 8 authors 357 2
Submitted by Yalimu 3 One-Token Rollout: Guiding Supervised Fine-Tuning of LLMs with Policy Gradient · 5 authors 2
Submitted by taesiri 3 SKYLENAGE Technical Report: Mathematical Reasoning and Contest-Innovation Benchmarks for Multi-Level Math Evaluation · 18 authors 2
Submitted by James-WYang 2 Parallel Scaling Law: Unveiling Reasoning Generalization through A Cross-Linguistic Perspective Chinese Academy of Sciences Institute of Automation 2
Submitted by Wyattz23 2 TimeSeriesScientist: A General-Purpose AI Agent for Time Series Analysis · 7 authors 2
Submitted by patricebechard 2 Optimizing What Matters: AUC-Driven Learning for Robust Neural Retrieval ServiceNow-AI 2
Submitted by zzhao0104 2 Controlled Generation for Private Synthetic Text Center for Language and Speech Processing @ JHU 2
Submitted by Xiaoye08 2 FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting · 6 authors 7 3
Submitted by tetrisd 1 Drawing Conclusions from Draws: Rethinking Preference Semantics in Arena-Style LLM Evaluation University College London 0 2
Submitted by taesiri 1 MedQ-Bench: Evaluating and Exploring Medical Image Quality Assessment Abilities in MLLMs · 20 authors 2
Submitted by nandan523 1 Spectral Scaling Laws in Language Models: How Effectively Do Feed-Forward Networks Use Their Latent Space? New York University 2
Submitted by whats2000 1 SQUARE: Semantic Query-Augmented Fusion and Efficient Batch Reranking for Training-free Zero-Shot Composed Image Retrieval · 3 authors 2
Submitted by dinobby - Think Right: Learning to Mitigate Under-Over Thinking via Adaptive, Attentive Compression · 6 authors 2
Submitted by pranamanam - AReUReDi: Annealed Rectified Updates for Refining Discrete Flows with Multi-Objective Guidance · 3 authors 2
Submitted by therem - IoT-MCP: Bridging LLMs and IoT Systems Through Model Context Protocol · 10 authors 2