AutoMLGen: Navigating Fine-Grained Optimization for Coding Agents Paper • 2510.08511 • Published Oct 9, 2025
Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language Models Paper • 2512.06281 • Published Dec 6, 2025
No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves Paper • 2505.02831 • Published May 5, 2025 • 1
Distribution Matching Distillation Meets Reinforcement Learning Paper • 2511.13649 • Published Nov 17, 2025 • 2
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27, 2025 • 224
Deforming Videos to Masks: Flow Matching for Referring Video Segmentation Paper • 2510.06139 • Published Oct 7, 2025 • 2
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published Oct 7, 2025 • 54
EVTAR: End-to-End Try on with Additional Unpaired Visual Reference Paper • 2511.00956 • Published Nov 2, 2025 • 4