Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2403.04692

This collection contains all the PixArt-Sigma related models, spaces and so on.

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Paper • 2403.04692 • Published Mar 7, 2024 • 40
Runtime error

234

234

PixArt Sigma

👁
PixArt-alpha/PixArt-Sigma-XL-2-1024-MS

Text-to-Image • Updated May 4, 2024 • 6.28k • • 86
PixArt-alpha/pixart_sigma_sdxlvae_T5_diffusers

Text-to-Image • Updated Apr 7, 2024 • 5.98k • 33

Pix2Gif: Motion-Guided Diffusion for GIF Generation

Paper • 2403.04634 • Published Mar 7, 2024 • 17
StableDrag: Stable Dragging for Point-based Image Editing

Paper • 2403.04437 • Published Mar 7, 2024 • 28
Teaching Large Language Models to Reason with Reinforcement Learning

Paper • 2403.04642 • Published Mar 7, 2024 • 48
Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7, 2024 • 63

Video/Image/Gif/etc.

Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models

Paper • 2402.17177 • Published Feb 27, 2024 • 88
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

Paper • 2402.17485 • Published Feb 27, 2024 • 191
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks

Paper • 2403.00522 • Published Mar 1, 2024 • 46
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Paper • 2403.04692 • Published Mar 7, 2024 • 40

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Paper • 2403.04692 • Published Mar 7, 2024 • 40

RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization

Paper • 2403.00483 • Published Mar 1, 2024 • 15
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Paper • 2403.01779 • Published Mar 4, 2024 • 30
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers

Paper • 2401.11605 • Published Jan 21, 2024 • 22
FiT: Flexible Vision Transformer for Diffusion Model

Paper • 2402.12376 • Published Feb 19, 2024 • 48

Learning Continuous 3D Words for Text-to-Image Generation

Paper • 2402.08654 • Published Feb 13, 2024 • 12
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Paper • 2403.04692 • Published Mar 7, 2024 • 40

MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24, 2024 • 47
A Touch, Vision, and Language Dataset for Multimodal Alignment

Paper • 2402.13232 • Published Feb 20, 2024 • 15
Neural Network Diffusion

Paper • 2402.13144 • Published Feb 20, 2024 • 95
FlashTex: Fast Relightable Mesh Texturing with LightControlNet

Paper • 2402.13251 • Published Feb 20, 2024 • 15

Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models

Paper • 2312.09608 • Published Dec 15, 2023 • 16
CodeFusion: A Pre-trained Diffusion Model for Code Generation

Paper • 2310.17680 • Published Oct 26, 2023 • 69
ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image

Paper • 2310.17994 • Published Oct 27, 2023 • 8
Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss

Paper • 2401.02677 • Published Jan 5, 2024 • 23

One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning

Paper • 2306.07967 • Published Jun 13, 2023 • 24
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation

Paper • 2306.07954 • Published Jun 13, 2023 • 112
TryOnDiffusion: A Tale of Two UNets

Paper • 2306.08276 • Published Jun 14, 2023 • 73
Seeing the World through Your Eyes

Paper • 2306.09348 • Published Jun 15, 2023 • 33

The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

Paper • 2311.10093 • Published Nov 16, 2023 • 58
NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation

Paper • 2311.12229 • Published Nov 20, 2023 • 27
Diffusion Model Alignment Using Direct Preference Optimization

Paper • 2311.12908 • Published Nov 21, 2023 • 50
VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models

Paper • 2312.00845 • Published Dec 1, 2023 • 39

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs