Collections
Collections including paper arxiv:2412.01064
- OmniGen: Unified Image Generation
  Paper • 2409.11340 • Published • 111
- Video-Guided Foley Sound Generation with Multimodal Controls
  Paper • 2411.17698 • Published • 8
- FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait
  Paper • 2412.01064 • Published • 26
- OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
  Paper • 2412.01169 • Published • 12

- Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
  Paper • 2405.08748 • Published • 22
- Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
  Paper • 2405.10300 • Published • 28
- Chameleon: Mixed-Modal Early-Fusion Foundation Models
  Paper • 2405.09818 • Published • 130
- OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
  Paper • 2405.11143 • Published • 36