Finetuned Stable Diffusion for Shoe Image Generation

This repository hosts a finetuned Stable Diffusion model specialized in generating high-quality shoe images. The model is trained on the UT Zappos50K dataset, which consists of over 50,000 catalog-style shoe images with clean backgrounds.

Model Overview

Base model: Stable Diffusion (v1.4 or v2; exact version not specified)
Finetuned on the UT Zappos50K dataset of shoe images
Generates product-style shoe images with consistent framing and high detail
Ideal for concepting shoe designs, data augmentation, and e-commerce visual content

Usage

Install dependencies:

pip install diffusers transformers accelerate torch

Example Python code for inference:

import torch
from diffusers import StableDiffusionPipeline

model_id = "benisonjac/finetune-of-stable-diffuson-on-Zappos-shoe-dataset"

# Load the finetuned weights in half precision and move the pipeline to the GPU.
pipe = StableDiffusionPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    safety_checker=None,  # disables the safety checker; see Limitations
    use_safetensors=True,
).to("cuda")

prompt = "studio product photo of a black leather ankle boot, side view, isolated on white background, high detail"
negative_prompt = "blurry, low resolution, watermark, deformed"

# A fixed seed makes the output reproducible.
# .images is a list of PIL images, so take the first one.
image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    guidance_scale=7.5,
    num_inference_steps=30,
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]

image.save("generated_shoe.png")
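The example above assumes a CUDA GPU is available. The following is a minimal sketch of choosing the device and dtype at runtime so the same script can fall back to CPU with float32, since half precision is generally poorly supported on CPU. The helper name pick_device_and_dtype is hypothetical and not part of diffusers.

```python
def pick_device_and_dtype(cuda_available: bool) -> tuple[str, str]:
    """Return a (device, dtype_name) pair for loading the pipeline.

    Hypothetical helper: float16 on GPU to save memory, float32 on CPU.
    """
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "float32"


try:
    import torch
    cuda_available = torch.cuda.is_available()
except ImportError:  # lets the sketch run even where torch is not installed
    torch = None
    cuda_available = False

device, dtype_name = pick_device_and_dtype(cuda_available)
print(device, dtype_name)
```

With torch installed, the result can be used in the inference example by passing torch_dtype=getattr(torch, dtype_name) to from_pretrained, calling .to(device) on the pipeline, and seeding torch.Generator(device) instead of hard-coding "cuda". Expect CPU generation to be much slower.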

Prompting Tips

Use phrases like "studio product photo," "isolated on white background," and "high detail" to emphasize the product catalog style.
Specify shoe type, color, material, and view angle to guide generation.
Use negative prompts such as "blurry," "watermark," and "deformed" to improve image clarity.
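The tips above can be sketched as a small prompt builder that combines shoe type, color, material, and view angle with the catalog-style phrases. The function name build_shoe_prompt and its parameters are illustrative assumptions, not part of this repository or of diffusers.

```python
def build_shoe_prompt(shoe_type, color=None, material=None, view=None):
    """Compose a catalog-style prompt from a shoe type and optional attributes."""
    if not shoe_type:
        raise ValueError("shoe_type is required")
    # Order the descriptors as color, material, shoe type, skipping missing ones.
    subject = " ".join(part for part in (color, material, shoe_type) if part)
    article = "an" if subject[0].lower() in "aeiou" else "a"
    parts = [f"studio product photo of {article} {subject}"]
    if view:
        parts.append(view)
    # Phrases that emphasize the product catalog style.
    parts += ["isolated on white background", "high detail"]
    return ", ".join(parts)


# Negative prompt reused from the inference example.
NEGATIVE_PROMPT = "blurry, low resolution, watermark, deformed"

prompt = build_shoe_prompt(
    "ankle boot", color="black", material="leather", view="side view"
)
print(prompt)
```

Called with these arguments, the helper reproduces the prompt string used in the inference example, so the result can be passed to pipe(...) together with NEGATIVE_PROMPT.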

Dataset

The model is finetuned on the UT Zappos50K dataset, which contains over 50,000 images of shoes including boots, sandals, slippers, and more, all with consistent white backgrounds and catalog styling.

Limitations

Best suited for single shoe objects with clean backgrounds.
May produce artifacts or less realistic results for complex scenes or multiple objects.
Not intended for non-shoe or human image generation.
Safety checker is disabled in the usage example (safety_checker=None); use caution in downstream deployment.

License and Citation

Please adhere to the licensing terms of the base Stable Diffusion model and the UT Zappos50K dataset when using or redistributing this finetuned model.

If you use this model in your work, please cite the UT Zappos50K dataset and the original Stable Diffusion model.

Acknowledgements

Stable Diffusion and Hugging Face Diffusers
UT Austin Zappos50K dataset creators
Hugging Face Hub for model hosting infrastructure

Feel free to open issues for questions or improvement suggestions.
