Collections
Collections including paper arxiv:2408.03178
- An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion
  Paper • 2408.03178 • Published • 39
- VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers
  Paper • 2408.17131 • Published • 11
- LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync
  Paper • 2412.09262 • Published • 1

- GECO: Generative Image-to-3D within a SECOnd
  Paper • 2405.20327 • Published • 10
- Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
  Paper • 2406.03184 • Published • 22
- NPGA: Neural Parametric Gaussian Avatars
  Paper • 2405.19331 • Published • 10
- Unified Text-to-Image Generation and Retrieval
  Paper • 2406.05814 • Published • 16

- Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
  Paper • 2405.08748 • Published • 24
- Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
  Paper • 2405.10300 • Published • 29
- Chameleon: Mixed-Modal Early-Fusion Foundation Models
  Paper • 2405.09818 • Published • 131
- OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
  Paper • 2405.11143 • Published • 38

- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
  Paper • 2404.12253 • Published • 55
- Time Machine GPT
  Paper • 2404.18543 • Published • 2
- Diffusion for World Modeling: Visual Details Matter in Atari
  Paper • 2405.12399 • Published • 30
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
  Paper • 2405.12130 • Published • 50

- TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
  Paper • 2401.09416 • Published • 11
- SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
  Paper • 2401.10171 • Published • 14
- DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
  Paper • 2311.09217 • Published • 22
- GALA: Generating Animatable Layered Assets from a Single Scan
  Paper • 2401.12979 • Published • 9

- Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models
  Paper • 2312.09608 • Published • 16
- CodeFusion: A Pre-trained Diffusion Model for Code Generation
  Paper • 2310.17680 • Published • 69
- ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image
  Paper • 2310.17994 • Published • 8
- Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss
  Paper • 2401.02677 • Published • 23

- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 77
- An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
  Paper • 2309.09958 • Published • 19
- Noise-Aware Training of Layout-Aware Language Models
  Paper • 2404.00488 • Published • 10
- Streaming Dense Video Captioning
  Paper • 2404.01297 • Published • 13