Daily Papers

by AK and the research community

Oct 10

Submitted by

taesiri

Agent Learning via Early Experience

metaresearch

Submitted by

PhoenixZ

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

·
14 authors

Submitted by

Blue-Giant

MemMamba: Rethinking Memory Patterns in State Space Model

·
5 authors

Submitted by

taesiri

UniVideo: Unified Understanding, Generation, and Editing for Videos

·
8 authors

Submitted by

taesiri

VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning

KwaiVGI

Kuaishou Visual Generation and Interaction Center

Submitted by

Blue-Giant

From What to Why: A Multi-Agent System for Evidence-based Chemical Reaction Condition Reasoning

·
5 authors

Submitted by

starsuzi

When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs

amazon

Submitted by

yjyjyj98

Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning

kaist-ai

Submitted by

jackzhang

The Alignment Waltz: Jointly Training Agents to Collaborate for Safety

Submitted by

binxia

DreamOmni2: Multimodal Instruction-based Editing and Generation

·
13 authors

Submitted by

Carlanlarkk

Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

tencent

Submitted by

tqfang229

NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents

·
13 authors

Submitted by

Kylin-ll

Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense

Submitted by

taesiri

Training-Free Group Relative Policy Optimization

tencent

Submitted by

UML

ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation

Submitted by

tsq2000

DeepPrune: Parallel Scaling without Inter-trace Redundancy

THU-KEG

Knowledge Engineer Group @ Tsinghua University

Submitted by

olafyiii

First Try Matters: Revisiting the Role of Reflection in Reasoning Models

·
6 authors

Submitted by

Foreshhh

LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions

Fudan-University

Fudan University

Submitted by

YOKIMIYA

UniMMVSR: A Unified Multi-Modal Framework for Cascaded Video Super-Resolution

KwaiVGI

Kuaishou Visual Generation and Interaction Center

Submitted by

Changyao

NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints

OpenGVLab

Submitted by

xxyQwQ

CoMAS: Co-Evolving Multi-Agent Systems via Interaction Rewards

·
10 authors

Submitted by

SoroushMehraban

PickStyle: Video-to-Video Style Transfer with Context-Style Adapters

Pickford

Submitted by

Adapter

InstructX: Towards Unified Visual Editing with MLLM Guidance

·
8 authors

Submitted by

ZetangForward

LongRM: Revealing and Unlocking the Context Boundary of Reward Modeling

SUDA

Soochow University

Submitted by

canqin001

UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG

Salesforce

Submitted by

Wayne-lc

Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks

KnowledgeXLab

KnowledgeXLab@Shanghai AI Lab

Submitted by

Guan123

Taming Text-to-Sounding Video Generation via Advanced Modality Condition and Interaction

apple

Submitted by

Luo-Yihong

Reinforcing Diffusion Models by Direct Group Preference Optimization

·
3 authors

Submitted by

taesiri

SciVideoBench: Benchmarking Scientific Video Reasoning in Large Multimodal Models

·
8 authors

Submitted by

worstcoder

Large Scale Diffusion Distillation via Score-Regularized Continuous-Time Consistency

·
10 authors

Submitted by

xiangh

Beyond Turn Limits: Training Deep Search Agents with Dynamic Context Window

·
14 authors

Submitted by

lliutianc

OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment

·
7 authors

Submitted by

ChonghuaLiao

Entropy Regularizing Activation: Boosting Continuous Control, Large Language Models, and Image Classification with Activation as Entropy Constraints

·
4 authors

Submitted by

hyc2026

Memory Retrieval and Consolidation in Large Language Models through Function Tokens

ByteDance-Seed

Submitted by

Mr-Philo

Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training

·
8 authors

Submitted by

Co2y

UP2You: Fast Reconstruction of Yourself from Unconstrained Photo Collections

·
7 authors

Submitted by

xymeow7

DexNDM: Closing the Reality Gap for Dexterous In-Hand Rotation via Joint-Wise Neural Dynamics Model

·
3 authors

Submitted by

zfj1998

A^2Search: Ambiguity-Aware Question Answering with Reinforcement Learning

CityUniversityofHongKong

City University of Hong Kong

Submitted by

Ach0

GCPO: When Contrast Fails, Go Gold

tencent

Submitted by

Franck-Dernoncourt

Learning to Route LLMs from Bandit Feedback: One Policy, Many Trade-offs

·
7 authors

Submitted by

cfahlgren1

OmniRetarget: Interaction-Preserving Data Generation for Humanoid Whole-Body Loco-Manipulation and Scene Interaction

·
9 authors

Submitted by

xuxw98

R2RGEN: Real-to-Real 3D Data Generation for Spatially Generalized Manipulation

·
7 authors

Submitted by

ytgui

Search-R3: Unifying Reasoning and Embedding Generation in Large Language Models

·
2 authors

Submitted by

jiahaoplus

Drive&Gen: Co-Evaluating End-to-End Driving and Video Generation Models

·
14 authors

Submitted by

Saleh

Beyond Outliers: A Study of Optimizers Under Quantization

SPCL

Scalable Parallel Computing Laboratory (SPCL)

Submitted by

andreasengelhardt

SViM3D: Stable Video Material Diffusion for Single Image 3D Generation

stabilityai

Submitted by

paischer101

GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Simulations

JKU

Johannes Kepler University

Submitted by

XiaRho

Towards Scalable and Consistent 3D Editing

·
3 authors

Submitted by

ahmedhendawy19

Use the Online Network If You Can: Towards Fast and Stable Reinforcement Learning

·
6 authors

Submitted by

ryancll118

Fidelity-Aware Data Composition for Robust Robot Generalization

·
9 authors