Daily Papers

by AK and the research community

Jan 21

Submitted by

zawnpn

Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization

BeingBeyond

Submitted by

itaowe

Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey

DeepSoftwareAnalytics

Submitted by

williamium

Think3D: Thinking with Space for Spatial Reasoning

·
12 authors

Submitted by

PangzeCheung

OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer

ByteDance

Submitted by

adwardlee

Toward Efficient Agents: Memory, Tool learning, and Planning

Submitted by

Jinlan

FutureOmni: Evaluating Future Forecasting from Omni-Modal Context for Multimodal LLMs

OpenMOSS-Team

Submitted by

ZetangForward

MemoryRewardBench: Benchmarking Reward Models for Long-Term Memory Management in Large Language Models

SUDA

Soochow University

Submitted by

hengyuanya

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

·
28 authors

Submitted by

ZrH42

UniX: Unifying Autoregression and Diffusion for Chest X-Ray Understanding and Generation

WuhanUniversity

Wuhan University

Submitted by

wjldw

ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents

intuit

Submitted by

LulaCola

DARC: Decoupled Asymmetric Reasoning Curriculum for LLM Evolution

RUC

Renmin University of China

Submitted by

Ningyu

Aligning Agentic World Models via Knowledgeable Experience Learning

Zhejiang University

Submitted by

xyma

Agentic-R: Learning to Retrieve for Agentic Search

·
7 authors

Submitted by

lucianodelcorro

A BERTology View of LLM Orchestrations: Token- and Layer-Selective Probes for Efficient Single-Pass Classification

UdeSA

Universidad de San Andrés

Submitted by

avanturist

KAGE-Bench: Fast Known-Axis Visual Generalization Evaluation for Reinforcement Learning

·
4 authors

Submitted by

staghado

LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR

lightonai

Submitted by

shikhar7ssu

PRiSM: Benchmarking Phone Realization in Speech Models

changelinglab

Submitted by

frankjiang

FantasyVLN: Unified Multimodal Chain-of-Thought Reasoning for Vision-Language Navigation

acvlab

Alibaba AMAP CV Lab

Submitted by

Umean

Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment

Fudan-University

Fudan University

Submitted by

JackBAI

InT: Self-Proposed Interventions Enable Credit Assignment in LLM Reasoning

CMU-AIRe

CMU Artificial Intelligence and Reinforcement Learning (AIRe) Lab

Submitted by

learn3r

Uncertainty-Aware Gradient Signal-to-Noise Data Selection for Instruction Tuning

alibaba-inc

Submitted by

bilgehanertan

On the Evidentiary Limits of Membership Inference for Copyright Auditing

·
5 authors

Submitted by

bilgehanertan

Fundamental Limitations of Favorable Privacy-Utility Guarantees for DP-SGD

cwiamsterdam

Centrum Wiskunde & Informatica

Submitted by

Stephen-smj

DSAEval: Evaluating Data Science Agents on a Wide Range of Real-World Data Science Problems

PolyUHK

The Hong Kong Polytechnic University

Submitted by

MElHuseyni

A Hybrid Protocol for Large-Scale Semantic Dataset Generation in Low-Resource Languages: The Turkish Semantic Relations Corpus

·
4 authors

Submitted by

MElHuseyni

Beyond Cosine Similarity: Taming Semantic Drift and Antonym Intrusion in a 15-Million Node Turkish Synonym Graph

·
4 authors

Submitted by

paraslossfunk

METIS: Mentoring Engine for Thoughtful Inquiry & Solutions

Lossfunk

Submitted by

timbmg

SciCoQA: Quality Assurance for Scientific Paper-Code Alignment

UKPLab

Ubiquitous Knowledge Processing Lab

Submitted by

nitay

LIBERTy: A Causal Framework for Benchmarking Concept-Based Explanations of LLMs with Structural Counterfactuals

Technion

Technion Israel Institute of Technology

Submitted by

JeremiasTraub

Finally Outshining the Random Baseline: A Simple and Effective Solution for Active Learning in 3D Biomedical Imaging

MIC-DKFZ

Submitted by

VentureZJ

Towards Efficient and Robust Linguistic Emotion Diagnosis for Mental Health via Multi-Agent Instruction Refinement

·
8 authors

Submitted by

yilmazkorkmaz

RemoteVAR: Autoregressive Visual Modeling for Remote Sensing Change Detection

JohnsHopkins

Johns Hopkins University