[NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
- Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
  Paper • 2506.18898
- Tar (🚀 47): Unified MLLM with Text-Aligned Representations
- Tar (🚀 3): Unified MLLM with Text-Aligned Representations
- Tar (🚀 60): Unified MLLM with Text-Aligned Representations