Diffusers documentation
HiDreamImageTransformer2DModel
A Transformer model for image-like data from HiDream-I1.
The model can be loaded with the following code snippet.
import torch
from diffusers import HiDreamImageTransformer2DModel

transformer = HiDreamImageTransformer2DModel.from_pretrained("HiDream-ai/HiDream-I1-Full", subfolder="transformer", torch_dtype=torch.bfloat16)
HiDreamImageTransformer2DModel
class diffusers.HiDreamImageTransformer2DModel
< source >( patch_size: typing.Optional[int] = None, in_channels: int = 64, out_channels: typing.Optional[int] = None, num_layers: int = 16, num_single_layers: int = 32, attention_head_dim: int = 128, num_attention_heads: int = 20, caption_channels: typing.List[int] = None, text_emb_dim: int = 2048, num_routed_experts: int = 4, num_activated_experts: int = 2, axes_dims_rope: typing.Tuple[int, int] = (32, 32), max_resolution: typing.Tuple[int, int] = (128, 128), llama_layers: typing.List[int] = None )
Transformer2DModelOutput
class diffusers.models.modeling_outputs.Transformer2DModelOutput
< source >( sample: torch.Tensor )
Parameters
- sample (torch.Tensor of shape (batch_size, num_channels, height, width) or (batch_size, num_vector_embeds - 1, num_latent_pixels) if Transformer2DModel is discrete) — The hidden states output conditioned on the encoder_hidden_states input. If discrete, returns probability distributions for the unnoised latent pixels.
The output of Transformer2DModel.