---
tags:
- SanaControlNetPipeline
pipeline_tag: text-to-image
license: mit
---

<p align="center" style="border-radius: 10px">
<img src="https://raw.githubusercontent.com/NVlabs/Sana/refs/heads/main/asset/logo.png" width="35%" alt="logo"/>
</p>

<div style="display:flex;justify-content: center">
<a href="https://huggingface.co/collections/Efficient-Large-Model/sana-673efba2a57ed99843f11f9e"><img src="https://img.shields.io/static/v1?label=Demo&message=Huggingface&color=yellow"></a>
<a href="https://github.com/NVlabs/Sana"><img src="https://img.shields.io/static/v1?label=Code&message=Github&color=blue&logo=github"></a>
<a href="https://nvlabs.github.io/Sana/"><img src="https://img.shields.io/static/v1?label=Project&message=Github&color=blue&logo=github-pages"></a>
<a href="https://hanlab.mit.edu/projects/sana/"><img src="https://img.shields.io/static/v1?label=Page&message=MIT&color=darkred&logo=github-pages"></a>
<a href="https://arxiv.org/abs/2410.10629"><img src="https://img.shields.io/static/v1?label=Arxiv&message=Sana&color=red&logo=arxiv"></a>
<a href="https://nv-sana.mit.edu/"><img src="https://img.shields.io/static/v1?label=Demo&message=MIT&color=yellow"></a>
<a href="https://discord.gg/rde6eaE5Ta"><img src="https://img.shields.io/static/v1?label=Discuss&message=Discord&color=purple&logo=discord"></a>
</div>

# Model card

We introduce **Sana**, a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution.
Sana synthesizes high-resolution, high-quality images with strong text-image alignment at remarkably fast speed and can be deployed on a laptop GPU.

Source code is available at https://github.com/NVlabs/Sana.

### 🧨 Diffusers

### 1. How to use `SanaControlNetPipeline` with `🧨diffusers`

```python
# Run `pip install git+https://github.com/huggingface/diffusers` before using Sana in diffusers.
import torch
from diffusers import SanaControlNetModel, SanaControlNetPipeline
from diffusers.utils import load_image

# Load the ControlNet weights in half precision.
controlnet = SanaControlNetModel.from_pretrained(
    "ishan24/Sana_600M_1024px_ControlNet_diffusers",
    torch_dtype=torch.float16,
)

# Load the base Sana pipeline and attach the ControlNet to it.
pipe = SanaControlNetPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_600M_1024px_diffusers",
    variant="fp16",
    controlnet=controlnet,
    torch_dtype=torch.float16,
)

pipe.to("cuda")
# Cast the VAE and the text encoder to bfloat16.
pipe.vae.to(torch.bfloat16)
pipe.text_encoder.to(torch.bfloat16)

# Example control image that conditions the generation.
cond_image = load_image(
    "https://huggingface.co/ishan24/Sana_600M_1024px_ControlNet_diffusers/resolve/main/hed_example.png"
)
prompt = 'a cat with a neon sign that says "Sana"'
image = pipe(
    prompt,
    control_image=cond_image,
).images[0]
image.save("sana.png")
```
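
The example above relies on the pipeline's default output size. If your control image is not square, you may want to request an output size that matches its aspect ratio. The following is a minimal sketch of one way to pick such a size. `fit_size` is a hypothetical helper, not part of diffusers. It assumes that the pipeline accepts `height` and `width` arguments and expects both to be divisible by 32; please check the diffusers documentation for `SanaControlNetPipeline` before relying on either assumption.

```python
def fit_size(width, height, target=1024, multiple=32):
    """Scale (width, height) so the longer side is about `target` pixels,
    snapping each side to the nearest multiple of `multiple`.

    Hypothetical helper: the default target of 1024 and the multiple of 32
    are assumptions about what the Sana 1024px checkpoint expects.
    """
    scale = target / max(width, height)

    def snap(value):
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, round(value * scale / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    # A 16:9 control image is mapped to a 16:9 output size.
    print(fit_size(1920, 1080))
```

With the objects from the example above, this could be used as `w, h = fit_size(*cond_image.size)` followed by `pipe(prompt, control_image=cond_image, height=h, width=w)`, again assuming the pipeline accepts those arguments.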