The FLUX.1-dev model with its transformer and T5 text encoder quantized to 4-bit NF4 via bitsandbytes.

Usage

pip install -U diffusers transformers accelerate bitsandbytes

from diffusers import FluxPipeline
import torch

pipeline = FluxPipeline.from_pretrained("eramth/flux-4bit", torch_dtype=torch.float16).to("cuda")

# VAE tiling lets you generate higher-resolution images without much extra VRAM usage.
pipeline.vae.enable_tiling()

image = pipeline(prompt="a cute cat", num_inference_steps=25, guidance_scale=3.5).images[0]
image  # displayed automatically in a notebook; otherwise save it, e.g. image.save("cat.png")
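
If VRAM is still tight, diffusers pipelines also support model CPU offloading, which keeps components on the CPU and moves each one to the GPU only while it runs. This is optional and not part of the model itself; the sketch below assumes a recent diffusers/accelerate install:

from diffusers import FluxPipeline
import torch

pipeline = FluxPipeline.from_pretrained("eramth/flux-4bit", torch_dtype=torch.float16)
# Instead of .to("cuda"), offload components and load each onto the GPU only while
# it is executing. This is slower per image but lowers peak VRAM usage.
pipeline.enable_model_cpu_offload()
pipeline.vae.enable_tiling()
image = pipeline(prompt="a cute cat", num_inference_steps=25, guidance_scale=3.5).images[0]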

You can create this quantized model yourself with the following script:

from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig
from transformers import BitsAndBytesConfig as TransformersBitsAndBytesConfig
from diffusers import FluxPipeline,FluxTransformer2DModel
from transformers import T5EncoderModel
import torch

token = ""    # your Hugging Face access token (needed to download FLUX.1-dev and to push)
repo_id = ""  # destination repo on the Hub for the quantized model

# 4-bit NF4 quantization config for the T5 text encoder (transformers backend).
quant_config = TransformersBitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
)

text_encoder_2_4bit = T5EncoderModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="text_encoder_2",
    quantization_config=quant_config,
    torch_dtype=torch.float16,
    token=token
)

# The same 4-bit NF4 config, but from diffusers, for the Flux transformer.
quant_config = DiffusersBitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
)

transformer_4bit = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.float16,
    token=token
)

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer_4bit,
    text_encoder_2=text_encoder_2_4bit,
    torch_dtype=torch.float16,
    token=token
)

# Upload the assembled pipeline (with the 4-bit transformer and text encoder) to the Hub.
pipe.push_to_hub(repo_id, token=token)
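
Once the components are quantized, you can optionally check how much memory they occupy. This is a small sketch assuming a diffusers/transformers version recent enough to expose get_memory_footprint() on both models:

# Rough weight-memory check for the 4-bit components (printed in GiB).
print(f"transformer: {transformer_4bit.get_memory_footprint() / 1024**3:.2f} GiB")
print(f"text_encoder_2: {text_encoder_2_4bit.get_memory_footprint() / 1024**3:.2f} GiB")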