- Boundary Attention: Learning to Find Faint Boundaries at Any Resolution
  Paper • 2401.00935 • Published • 18
- Taming Mode Collapse in Score Distillation for Text-to-3D Generation
  Paper • 2401.00909 • Published • 10
- Q-Refine: A Perceptual Quality Refiner for AI-Generated Image
  Paper • 2401.01117 • Published • 10
- En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data
  Paper • 2401.01173 • Published • 12
Collections
Collections including paper arxiv:2401.06105
- ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations
  Paper • 2312.04655 • Published • 21
- FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition
  Paper • 2312.07536 • Published • 20
- Clockwork Diffusion: Efficient Generation With Model-Step Distillation
  Paper • 2312.08128 • Published • 15
- CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor
  Paper • 2312.07661 • Published • 19

- StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation
  Paper • 2312.12491 • Published • 70
- Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
  Paper • 2401.11708 • Published • 30
- Training-Free Consistent Text-to-Image Generation
  Paper • 2402.03286 • Published • 67
- PALP: Prompt Aligned Personalization of Text-to-Image Models
  Paper • 2401.06105 • Published • 49

- DeepCache: Accelerating Diffusion Models for Free
  Paper • 2312.00858 • Published • 24
- HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models
  Paper • 2312.00079 • Published • 17
- Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
  Paper • 2312.04410 • Published • 15
- SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing
  Paper • 2312.11392 • Published • 20

- DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
  Paper • 2208.12242 • Published • 11
- IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
  Paper • 2308.06721 • Published • 30
- h94/IP-Adapter-FaceID
  Text-to-Image • Updated • 400k • 1.67k
- PALP: Prompt Aligned Personalization of Text-to-Image Models
  Paper • 2401.06105 • Published • 49

- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
  Paper • 2306.07967 • Published • 24
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
  Paper • 2306.07954 • Published • 112
- TryOnDiffusion: A Tale of Two UNets
  Paper • 2306.08276 • Published • 73
- Seeing the World through Your Eyes
  Paper • 2306.09348 • Published • 33

- FreeU: Free Lunch in Diffusion U-Net
  Paper • 2309.11497 • Published • 65
- Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models
  Paper • 2311.12092 • Published • 23
- ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs
  Paper • 2311.13600 • Published • 45
- PALP: Prompt Aligned Personalization of Text-to-Image Models
  Paper • 2401.06105 • Published • 49

- A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation
  Paper • 2310.16656 • Published • 44
- CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images
  Paper • 2310.16825 • Published • 33
- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 42
- I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models
  Paper • 2311.04145 • Published • 35

- DreamLLM: Synergistic Multimodal Comprehension and Creation
  Paper • 2309.11499 • Published • 58
- FoleyGen: Visually-Guided Audio Generation
  Paper • 2309.10537 • Published • 9
- Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
  Paper • 2310.11441 • Published • 28
- The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
  Paper • 2311.10093 • Published • 58