Submitted by henggg 73 The Landscape of Agentic Reinforcement Learning for LLMs: A Survey · 25 authors 62 2
Submitted by lovesnowbest 69 UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning · 106 authors 2
Submitted by SivilTaram 64 SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning · 7 authors 213 2
Submitted by taesiri 61 LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model · 7 authors 4.17k 1
Submitted by HLSv 49 ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Long Video Understanding · 8 authors 7 1
Submitted by DongfuJiang 44 VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use · 12 authors 390 1
Submitted by YuanLiuuuuuu 38 POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion · 11 authors 3
Submitted by fairyang 25 Baichuan-M2: Scaling Medical Capability with Large Verifier System · 34 authors 1
Submitted by hammh0a 23 Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic · 3 authors 1
Submitted by Geaming 19 Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR · 8 authors 18 2
Submitted by dogtooth 18 Jointly Reinforcing Diversity and Quality in Language Model Generations · 8 authors 1
Submitted by rishiraj 18 Gated Associative Memory: A Parallel O(N) Architecture for Efficient Sequence Modeling · 1 author 4 1
Submitted by nsjain 15 DynaGuard: A Dynamic Guardrail Model With User-Defined Policies · 10 authors 4 2
Submitted by Xiaoyu521 15 GenCompositor: Generative Video Compositing with Diffusion Transformer · 7 authors 32 4
Submitted by Yanqing0327 14 OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning · 7 authors 305 1
Submitted by Andron00e 13 Benchmarking Optimizers for Large Language Model Pretraining · 3 authors 7 1
Submitted by ahnpersie 12 FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games · 7 authors 7 1
Submitted by kwangju 11 Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data Generation · 3 authors 1
Submitted by che111 9 M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision · 8 authors 1
Submitted by yulongchen 8 The Gold Medals in an Empty Room: Diagnosing Metalinguistic Reasoning in LLMs with Camlang · 6 authors 1
Submitted by quandao10 4 Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing · 9 authors 1
Submitted by amanchadha 2 AMBEDKAR-A Multi-level Bias Elimination through a Decoding Approach with Knowledge Augmentation for Robust Constitutional Alignment of Language Models · 8 authors 1
Submitted by zhangganlin 2 ViSTA-SLAM: Visual SLAM with Symmetric Two-view Association · 4 authors 43 1
Submitted by amanchadha 2 SQL-of-Thought: Multi-agentic Text-to-SQL with Guided Error Correction · 3 authors 1
Submitted by fengerhu 2 MobiAgent: A Systematic Framework for Customizable Mobile Agents · 10 authors 45 1
Submitted by xianbao 2 Metis: Training Large Language Models with Advanced Low-Bit Quantization · 16 authors 1
Submitted by theresiavr 2 Stairway to Fairness: Connecting Group and Individual Fairness · 5 authors 1 1
Submitted by evanking 1 Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices · 5 authors 2.85k 1
Submitted by taesiri 1 MedDINOv3: How to adapt vision foundation models for medical image segmentation? · 5 authors 9 1
Submitted by kenantang 1 Flaw or Artifact? Rethinking Prompt Sensitivity in Evaluating LLMs · 6 authors 1
Submitted by taesiri 1 Improving Large Vision and Language Models by Learning from a Panel of Peers · 5 authors 1
Submitted by aHapBean 1 Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views · 3 authors 4 2
Submitted by Bekhouche 1 C-DiffDet+: Fusing Global Scene Context with Generative Denoising for High-Fidelity Object Detection · 6 authors 1
Submitted by zhengchong 1 FastFit: Accelerating Multi-Reference Virtual Try-On via Cacheable Diffusion Models · 10 authors 25 1