Act2Goal: From World Model To General Goal-conditioned Policy Paper • 2512.23541 • Published 4 days ago • 21
Unimedvl: Unifying Medical Multimodal Understanding And Generation Through Observation-Knowledge-Analysis Paper • 2510.15710 • Published Oct 17, 2025 • 6
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey and Benchmark Paper • 2402.02242 • Published Feb 3, 2024
dMLLM-TTS: Self-Verified and Efficient Test-Time Scaling for Diffusion Multi-Modal Large Language Models Paper • 2512.19433 • Published 11 days ago • 3
dMLLM-TTS: Self-Verified and Efficient Test-Time Scaling for Diffusion Multi-Modal Large Language Models Paper • 2512.19433 • Published 11 days ago • 3
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published 23 days ago • 77
PICABench: How Far Are We from Physically Realistic Image Editing? Paper • 2510.17681 • Published Oct 20, 2025 • 62
Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling Paper • 2507.17801 • Published Jul 23, 2025 • 1
Resurrect Mask AutoRegressive Modeling for Efficient and Scalable Image Generation Paper • 2507.13032 • Published Jul 17, 2025
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published Oct 7, 2025 • 54
ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding Paper • 2507.14533 • Published Jul 19, 2025 • 6