Daily Papers

by AK and the research community

Sep 3

Submitted by

henggg

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

·
25 authors

Submitted by

lovesnowbest

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

·
106 authors

Submitted by

SivilTaram

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

·
7 authors

Submitted by

taesiri

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

·
7 authors

Submitted by

HLSv

ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Long Video Understanding

·
8 authors

Submitted by

DongfuJiang

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

·
12 authors

Submitted by

YuanLiuuuuuu

POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion

·
11 authors

Submitted by

fairyang

Baichuan-M2: Scaling Medical Capability with Large Verifier System

·
34 authors

Submitted by

taesiri

Kwai Keye-VL 1.5 Technical Report

·
60 authors

Submitted by

hammh0a

Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic

·
3 authors

Submitted by

Geaming

Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR

·
8 authors

Submitted by

dogtooth

Jointly Reinforcing Diversity and Quality in Language Model Generations

·
8 authors

Submitted by

rishiraj

Gated Associative Memory: A Parallel O(N) Architecture for Efficient Sequence Modeling

·
1 author

Submitted by

nsjain

DynaGuard: A Dynamic Guardrail Model With User-Defined Policies

·
10 authors

Submitted by

Xiaoyu521

GenCompositor: Generative Video Compositing with Diffusion Transformer

·
7 authors

Submitted by

yangshui

DCPO: Dynamic Clipping Policy Optimization

·
7 authors

Submitted by

Yanqing0327

OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning

·
7 authors

Submitted by

Andron00e

Benchmarking Optimizers for Large Language Model Pretraining

·
3 authors

Submitted by

kwangju

Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data Generation

·
3 authors

Submitted by

ahnpersie

FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games

·
7 authors

Submitted by

che111

M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision

·
8 authors

Submitted by

yulongchen

The Gold Medals in an Empty Room: Diagnosing Metalinguistic Reasoning in LLMs with Camlang

·
6 authors

Submitted by

eliebak

Fantastic Pretraining Optimizers and Where to Find Them

·
4 authors

Submitted by

pbelcak

Universal Deep Research: Bring Your Own Model and Strategy

·
2 authors

Submitted by

quandao10

Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing

·
9 authors

Submitted by

orionweller

On the Theoretical Limitations of Embedding-Based Retrieval

·
4 authors

Submitted by

amanchadha

AMBEDKAR-A Multi-level Bias Elimination through a Decoding Approach with Knowledge Augmentation for Robust Constitutional Alignment of Language Models

·
8 authors

Submitted by

zhangganlin

ViSTA-SLAM: Visual SLAM with Symmetric Two-view Association

·
4 authors

Submitted by

amanchadha

SQL-of-Thought: Multi-agentic Text-to-SQL with Guided Error Correction

·
3 authors

Submitted by

fengerhu

MobiAgent: A Systematic Framework for Customizable Mobile Agents

·
10 authors

Submitted by

xianbao

Metis: Training Large Language Models with Advanced Low-Bit Quantization

·
16 authors

Submitted by

theresiavr

Stairway to Fairness: Connecting Group and Individual Fairness

·
5 authors

Submitted by

evanking

Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices

·
5 authors

Submitted by

taesiri

MedDINOv3: How to adapt vision foundation models for medical image segmentation?

·
5 authors

Submitted by

kenantang

Flaw or Artifact? Rethinking Prompt Sensitivity in Evaluating LLMs

·
6 authors

Submitted by

taesiri

Improving Large Vision and Language Models by Learning from a Panel of Peers

·
5 authors

Submitted by

aHapBean

Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views

·
3 authors

Submitted by

Bekhouche

C-DiffDet+: Fusing Global Scene Context with Generative Denoising for High-Fidelity Object Detection

·
6 authors

Submitted by

zhengchong

FastFit: Accelerating Multi-Reference Virtual Try-On via Cacheable Diffusion Models

·
10 authors