Towards Scalable Pre-training of Visual Tokenizers for Generation Paper • 2512.13687 • Published about 1 month ago • 100
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss Paper • 2512.23447 • Published 17 days ago • 94
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation Paper • 2512.23576 • Published 17 days ago • 64
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published 15 days ago • 135
PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation Paper • 2512.24551 • Published 15 days ago • 18
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem Paper • 2512.24873 • Published 15 days ago • 99
Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling Paper • 2512.23959 • Published 16 days ago • 100
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting Paper • 2601.02151 • Published 10 days ago • 93
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published 6 days ago • 150
MMFormalizer: Multimodal Autoformalization in the Wild Paper • 2601.03017 • Published 9 days ago • 100
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published Dec 9, 2025 • 130