Collections
Discover the best community collections!
Collections including paper arxiv:2411.07232
-
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
Paper • 2411.07232 • Published • 64 -
Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering
Paper • 2408.09702 • Published • 11 -
Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion
Paper • 2412.14462 • Published • 15 -
ObjectMate: A Recurrence Prior for Object Insertion and Subject-Driven Generation
Paper • 2412.08645 • Published • 11
-
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
Paper • 2410.10306 • Published • 54 -
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
Paper • 2411.05003 • Published • 70 -
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation
Paper • 2411.04709 • Published • 25 -
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Paper • 2410.07171 • Published • 42
-
CLEAR: Character Unlearning in Textual and Visual Modalities
Paper • 2410.18057 • Published • 200 -
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation
Paper • 2410.23090 • Published • 54 -
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
Paper • 2410.23743 • Published • 60 -
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization
Paper • 2411.02355 • Published • 47
-
Controllable Text Generation for Large Language Models: A Survey
Paper • 2408.12599 • Published • 64 -
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations
Paper • 2408.12590 • Published • 36 -
Real-Time Video Generation with Pyramid Attention Broadcast
Paper • 2408.12588 • Published • 16 -
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Paper • 2408.11039 • Published • 58
-
Zero-shot Image Editing with Reference Imitation
Paper • 2406.07547 • Published • 32 -
The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing
Paper • 2406.10601 • Published • 66 -
UltraEdit: Instruction-based Fine-Grained Image Editing at Scale
Paper • 2407.05282 • Published • 13 -
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model
Paper • 2407.16982 • Published • 41
-
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Paper • 2405.08748 • Published • 22 -
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
Paper • 2405.10300 • Published • 28 -
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 130 -
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
Paper • 2405.11143 • Published • 36