Daily Papers

by AK and the research community

Dec 2

Submitted by

Zhaorun

GRAPE: Generalizing Robot Policy via Preference Alignment

·
9 authors

Submitted by

toshas

Video Depth without Video Models

·
8 authors

Submitted by

Jinyang23

Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS

·
6 authors

Submitted by

chujiezheng

Yi-Lightning Technical Report

·
42 authors

Submitted by

daixuancheng

On Domain-Specific Post-Training for Multimodal Large Language Models

·
8 authors

Submitted by

kinam0252

Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling

·
5 authors

Submitted by

dinobby

Reverse Thinking Makes LLMs Stronger Reasoners

·
11 authors

Submitted by

akhaliq

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

·
9 authors

Submitted by

happy0612

FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion

·
7 authors

Submitted by

akhaliq

Puzzle: Distillation-Based NAS for Inference-Optimized LLMs

·
24 authors

Submitted by

zeqixiao

Trajectory Attention for Fine-grained Video Motion Control

·
7 authors

Submitted by

akhaliq

Scaling Transformers for Low-Bitrate High-Quality Speech Coding

·
7 authors

Submitted by

arkimjh

Look Every Frame All at Once: Video-Ma$^2$mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing

·
4 authors

Submitted by

junwann

DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding

·
8 authors

Submitted by

Vishnou

MATATA: a weak-supervised MAthematical Tool-Assisted reasoning for Tabular Applications

·
3 authors

Submitted by

akhaliq

AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers

·
8 authors

Submitted by

emozilla

DeMo: Decoupled Momentum Optimization

·
3 authors

Submitted by

TajaKuzman

LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification

·
2 authors

Submitted by

hyz317

AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos

·
7 authors

Submitted by

zzt76

SpotLight: Shadow-Guided Object Relighting via Diffusion

·
6 authors

Submitted by

ethanrao1

Training Noise Token Pruning

·
3 authors