---
base_model: stable-diffusion-v1-5/stable-diffusion-v1-5
library_name: diffusers
license: creativeml-openrail-m
inference: true
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- diffusers-training
- lora
datasets:
- lambdalabs/naruto-blip-captions
---

# LoRA text2image fine-tuning - Bhaskar009/SD_1.5_LoRA

These are LoRA adaptation weights for stable-diffusion-v1-5/stable-diffusion-v1-5. The weights were fine-tuned on the lambdalabs/naruto-blip-captions dataset. You can find some example images below.

![img_0](./image_0.png)
![img_1](./image_1.png)
![img_2](./image_2.png)
![img_3](./image_3.png)

## Intended uses & limitations

#### How to use

```python
import torch
import matplotlib.pyplot as plt
from diffusers import DiffusionPipeline

# Load the base Stable Diffusion v1.5 pipeline in half precision
pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Load the fine-tuned LoRA weights from the Hub
pipe.load_lora_weights("Bhaskar009/SD_1.5_LoRA")

# Move the pipeline to the GPU
pipe.to("cuda")

# Define a Naruto-themed prompt
prompt = (
    "A detailed anime-style portrait of Naruto Uzumaki, wearing his Hokage cloak, "
    "standing under a bright sunset, ultra-detailed, cinematic lighting, 8K"
)

# Generate the image
image = pipe(prompt).images[0]

# Display the image using matplotlib
plt.figure(figsize=(6, 6))
plt.imshow(image)
plt.axis("off")  # hide axes for a clean view
plt.show()
```

#### Limitations and bias

[TODO: provide examples of latent issues and potential remediations]

## Training details

### Dataset

- The model was trained on the `lambdalabs/naruto-blip-captions` dataset.
- The dataset consists of Naruto character images paired with BLIP-generated captions.
- It covers a diverse set of characters, poses, and backgrounds, making it suitable for fine-tuning Stable Diffusion on anime-style images.

### Model

- Base model: Stable Diffusion v1.5 (`stable-diffusion-v1-5/stable-diffusion-v1-5`)
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- Purpose: specializing Stable Diffusion in generating Naruto-style anime characters.

### Preprocessing

- Images were resized to 512x512 resolution.
- Center cropping was applied to obtain square inputs without distorting the aspect ratio.
- Random horizontal flipping was used as a data augmentation technique.
- A minimal preprocessing sketch is included at the end of this card.

### Training Configuration

- Batch size: 1
- Gradient accumulation steps: 4 (simulates a larger effective batch size)
- Gradient checkpointing: enabled (reduces memory consumption)
- Max training steps: 800
- Learning rate: 1e-5 (constant schedule, no warmup)
- Max gradient norm: 1.0 (gradient clipping to prevent exploding gradients)
- Memory optimization: xFormers enabled for memory-efficient attention

### Validation

- The validation prompt "A Naruto character" was used.
- 4 validation images were generated during training.
- Model checkpoints were saved every 500 steps.
- An inference sketch that mirrors this validation setup is included at the end of this card.

### Model Output

- The fine-tuned LoRA weights were saved to `sd-naruto-model`.
- The weights were pushed to the Hugging Face Hub under the repository `Bhaskar009/SD_1.5_LoRA`.
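### Preprocessing sketch

The image preprocessing described in the Preprocessing section can be expressed with standard `torchvision` transforms. This is a minimal sketch based on that description, not the exact training code; the stand-in image is generated on the fly so the snippet runs without the dataset.

```python
from PIL import Image
from torchvision import transforms

# Preprocessing as described above: resize, center crop to 512x512, random horizontal
# flip, then convert to a tensor normalized to [-1, 1] (the range the SD v1.5 VAE expects).
train_transforms = transforms.Compose(
    [
        transforms.Resize(512, interpolation=transforms.InterpolationMode.BILINEAR),
        transforms.CenterCrop(512),
        transforms.RandomHorizontalFlip(),  # data augmentation
        transforms.ToTensor(),
        transforms.Normalize([0.5], [0.5]),
    ]
)

# Stand-in image so the snippet is self-contained; during training this would be a dataset sample.
image = Image.new("RGB", (1024, 768), color=(255, 128, 0))
pixel_values = train_transforms(image)
print(pixel_values.shape)  # torch.Size([3, 512, 512])
```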
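### Validation-style inference sketch

The validation setup above can be mirrored at inference time. The sketch below assumes the same LoRA loading API as the usage example; the seed, the LoRA scale passed through `cross_attention_kwargs`, and the output filenames are illustrative choices, not settings recorded from training.

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe.load_lora_weights("Bhaskar009/SD_1.5_LoRA")
pipe.to("cuda")

# Fixed seed so the batch is reproducible (the seed value itself is arbitrary).
generator = torch.Generator(device="cuda").manual_seed(0)

# Generate four images for the validation prompt, as during training.
# The "scale" entry dials the LoRA influence; 1.0 applies the adapter at full strength.
images = pipe(
    "A Naruto character",
    num_images_per_prompt=4,
    generator=generator,
    cross_attention_kwargs={"scale": 1.0},
).images

for i, img in enumerate(images):
    img.save(f"naruto_validation_{i}.png")
```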