DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks Paper • 2502.17157 • Published 17 days ago • 51
MatchAnything: Universal Cross-Modality Image Matching with Large-Scale Pre-Training Paper • 2501.07556 • Published Jan 13 • 5
MangaNinja: Line Art Colorization with Precise Reference Following Paper • 2501.08332 • Published Jan 14 • 57
Diffusion Adversarial Post-Training for One-Step Video Generation Paper • 2501.08316 • Published Jan 14 • 33
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games Paper • 2411.13543 • Published Nov 20, 2024 • 18
OminiControl: Minimal and Universal Control for Diffusion Transformer Paper • 2411.15098 • Published Nov 22, 2024 • 55
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models Paper • 2411.13503 • Published Nov 20, 2024 • 31
AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers Paper • 2411.18673 • Published Nov 27, 2024 • 8
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19, 2024 • 138