Finetuned Stable Diffusion for Shoe Image Generation
This repository hosts a finetuned Stable Diffusion model specialized in generating high-quality shoe images. The model is trained on the UT Zappos50K dataset, which consists of over 50,000 catalog-style shoe images with clean backgrounds.
Model Overview
- Base model: Stable Diffusion v1.5 (stable-diffusion-v1-5/stable-diffusion-v1-5)
- Finetuned on the UT Zappos50K dataset of shoe images
- Generates product-style shoe images with consistent framing and high detail
- Ideal for concepting shoe designs, data augmentation, and e-commerce visual content
Usage
Install dependencies:
pip install diffusers transformers accelerate torch
Example Python code for inference:
import torch
from diffusers import StableDiffusionPipeline

model_id = "benisonjac/finetune-of-stable-diffuson-on-Zappos-shoe-dataset"

# Load the pipeline in half precision on the GPU.
# Note: safety_checker=None disables the built-in safety checker.
pipe = StableDiffusionPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    safety_checker=None,
    use_safetensors=True,
).to("cuda")

prompt = "studio product photo of a black leather ankle boot, side view, isolated on white background, high detail"
negative_prompt = "blurry, low resolution, watermark, deformed"

# A fixed seed makes the output reproducible.
# .images is a list of PIL images, so take the first one before saving.
image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    guidance_scale=7.5,
    num_inference_steps=30,
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]

image.save("generated_shoe.png")
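To compare several candidates for the same prompt, one option is to loop over a list of seeds and save one image per seed. The sketch below is illustrative rather than part of this repository: the helper name `generate_seed_variations`, the `make_generator` parameter, and the `outputs` directory are assumptions. It expects `pipe` to behave like the pipeline loaded above, that is, a callable that accepts a `generator` keyword and returns an object with an `.images` list.

```python
from pathlib import Path


def _torch_generator(seed, device):
    # Imported lazily so this helper can be defined without torch installed.
    import torch

    return torch.Generator(device).manual_seed(seed)


def generate_seed_variations(pipe, prompt, seeds, out_dir="outputs",
                             device="cuda", make_generator=_torch_generator,
                             **pipe_kwargs):
    """Hypothetical helper: run `pipe` once per seed and save each first image.

    Extra keyword arguments (negative_prompt, guidance_scale, ...) are passed
    through to the pipeline call unchanged.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for seed in seeds:
        result = pipe(prompt, generator=make_generator(seed, device), **pipe_kwargs)
        path = out / f"shoe_seed{seed}.png"
        result.images[0].save(path)  # .images is a list; keep the first image
        paths.append(path)
    return paths
```

With the pipeline from the example above, a call might look like `generate_seed_variations(pipe, prompt, [1, 2, 3], negative_prompt=negative_prompt, guidance_scale=7.5, num_inference_steps=30)`.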
Prompting Tips
- Use phrases like "studio product photo," "isolated on white background," and "high detail" to emphasize the product catalog style.
- Specify shoe type, color, material, and view angle to guide generation.
- Use negative prompts such as "blurry," "watermark," and "deformed" to improve image clarity.
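The tips above can be folded into a small prompt-building helper so that every prompt follows the same catalog-style template. This is a minimal sketch: the function name `build_shoe_prompt`, its parameters, and the `DEFAULT_NEGATIVE_PROMPT` constant are hypothetical and not part of this repository. The template simply mirrors the example prompt in the Usage section.

```python
# Negative prompt taken from the usage example in this README.
DEFAULT_NEGATIVE_PROMPT = "blurry, low resolution, watermark, deformed"


def build_shoe_prompt(shoe_type, color=None, material=None, view="side view"):
    """Hypothetical helper: compose a catalog-style prompt from shoe attributes.

    Note: the article is always "a", so colors starting with a vowel
    (e.g. "orange") produce slightly ungrammatical text.
    """
    # Subject reads as "<color> <material> <shoe type>", skipping missing parts.
    subject = " ".join(part for part in (color, material, shoe_type) if part)
    parts = [
        f"studio product photo of a {subject}",
        view,
        "isolated on white background",
        "high detail",
    ]
    return ", ".join(part for part in parts if part)


prompt = build_shoe_prompt("ankle boot", color="black", material="leather")
```

The resulting `prompt` and `DEFAULT_NEGATIVE_PROMPT` can be passed to the pipeline as `prompt` and `negative_prompt` respectively.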
Dataset
The model is finetuned on the UT Zappos50K dataset, which contains over 50,000 images of shoes including boots, sandals, slippers, and more, all with consistent white backgrounds and catalog styling.
Limitations
- Best suited for single shoe objects with clean backgrounds.
- May produce artifacts or less realistic results for complex scenes or multiple objects.
- Not intended for non-shoe or human image generation.
- The usage example disables the safety checker (safety_checker=None); re-enable it or add your own content filtering before downstream deployment.
License and Citation
Please adhere to the licensing terms of the base Stable Diffusion model and the UT Zappos50K dataset when using or redistributing this finetuned model.
If you use this model in your work, please cite the UT Zappos50K dataset and the original Stable Diffusion model.
Acknowledgements
- Stable Diffusion and Hugging Face Diffusers
- UT Austin Zappos50K dataset creators
- Hugging Face Hub for model hosting infrastructure
Feel free to open issues for questions or improvement suggestions.
Model tree for benisonjac/finetune-of-stable-diffuson-on-Zappos-shoe-dataset
Base model
stable-diffusion-v1-5/stable-diffusion-v1-5