---
license: openrail++
language:
- en
library_name: diffusers
pipeline_tag: text-to-image
tags:
- text-to-image
base_model:
- stabilityai/stable-diffusion-xl-base-1.0
---
# Anime Stable Diffusion Model
A custom Stable Diffusion XL model fine-tuned for anime-style image generation, trained on a large dataset of anime images.
This is the first concept model of the series; I am still spending time filtering and processing the
larger dataset. The model is currently undertrained: it reflects certain concepts, but many
improvements are still needed.
## Prompt
Danbooru-style tagging.
- Quality tags: Masterpiece, high quality, normal quality, low quality
- Aesthetic tags: Very aesthetic, aesthetic, pleasent, unpleasent
- Additional special tags: High resolution, elegant, artist:

| Rating Modifier | Rating Criterion |
| --------------- | ---------------- |
| - | general |
| - | sensitive |
| nsfw | questionable |
| nsfw | explicit |
Recommended prompt order: rating tag, quality tag, aesthetic tag, (additional tag), general tag
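
The recommended order can be sketched as a small helper. `build_prompt` and its parameter names are hypothetical and not part of this model or of `diffusers`. The sketch also assumes that the rating tag consists of the modifier from the table above (if any) followed by the rating criterion; adjust it if the ratings were tagged differently during training.

```python
# Ratings that take the "nsfw" modifier, per the rating table above.
NSFW_RATINGS = {"questionable", "explicit"}
ALL_RATINGS = {"general", "sensitive"} | NSFW_RATINGS


def build_prompt(rating, quality, aesthetic, general_tags, additional_tags=()):
    """Join tags in the recommended order (hypothetical helper)."""
    if rating not in ALL_RATINGS:
        raise ValueError(f"unknown rating: {rating!r}")
    # Assumption: the rating tag is the modifier (if any) followed by the criterion.
    rating_tags = (["nsfw"] if rating in NSFW_RATINGS else []) + [rating]
    parts = rating_tags + [quality, aesthetic] + list(additional_tags) + list(general_tags)
    return ", ".join(parts)


prompt = build_prompt(
    "general",
    "masterpiece",
    "very aesthetic",
    ["1girl", "solo", "smile"],
    additional_tags=["high resolution"],
)
print(prompt)
```

The resulting string can be passed as the `prompt` argument of a text-to-image pipeline.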
### Dataset Specifications
- Total Images: 172k
- General Training Set: 160k images
- Aesthetic Fine-tuning Set: 12k high-quality images
- Resolution: 1024x1024
### Hardware Configuration
- GPUs: 2x NVIDIA RTX 6000 Ada
- Training Time: 16 days (general training), 3 days (aesthetic fine-tuning)
### Training Configuration
| Parameter | Value | Description |
|-----------|--------|-------------|
| Resolution | 1024x1024 | Training resolution |
| Batch Size | 8x2x2 | Effective batch size |
| Learning Rate | 5e-5 | Base learning rate |
| Text Encoder LR | 1e-5 | Learning rate for text encoder |
| Epochs | 10 | Total training epochs |
| Mixed Precision | FP16 | Training precision mode |
| Optimizer | AdamW8bit | Optimizer type |
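
As a rough back-of-the-envelope check, the batch size and epoch count above imply an approximate number of optimizer steps. This sketch assumes that `8x2x2` is a product of three factors (the card does not say which factor is which) and that the 10 epochs refer to the 160k-image general training set; both are assumptions, and the real step count may differ.

```python
import math


def effective_batch_size(factors):
    """Multiply the batch factors together, e.g. (8, 2, 2) -> 32."""
    return math.prod(factors)


def optimizer_steps(num_images, batch, epochs):
    """Return (steps per epoch, total steps), rounding partial batches up."""
    steps_per_epoch = math.ceil(num_images / batch)
    return steps_per_epoch, steps_per_epoch * epochs


# Assumption: 160k is treated as exactly 160,000 images.
batch = effective_batch_size((8, 2, 2))
per_epoch, total = optimizer_steps(160_000, batch, 10)
print(batch, per_epoch, total)  # 32 5000 50000
```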
### Advanced Settings
| Feature | Setting | Purpose |
|---------|---------|----------|
| Gradient Checkpointing | Enabled | Memory optimization |
| XFormers | Enabled | Attention optimization |
| Memory Efficient Attention | Enabled | Memory optimization |
| Bucket Resolution Steps | 128 | Dynamic resolution handling |
| Min Bucket Resolution | 512 | Minimum image size |
| Max Bucket Resolution | 4096 | Maximum image size |
| Noise Offset | 0.035 | Training stability |
| Min SNR Gamma | 5 | Signal-to-noise ratio control |
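
Two of the settings above can be illustrated with a minimal sketch. `make_buckets` enumerates aspect-ratio buckets in steps of 128 between the minimum and maximum bucket resolution, assuming the pixel budget per image is 1024x1024 (the training resolution). `min_snr_weight` shows the Min-SNR loss weight for epsilon prediction with gamma 5. Both function names are hypothetical, and this is not necessarily how the training script implements either feature.

```python
def make_buckets(max_area=1024 * 1024, min_size=512, max_size=4096, step=128):
    """Enumerate (width, height) buckets whose area does not exceed max_area."""
    buckets = set()
    width = min_size
    while width <= max_size:
        # Largest height that is a multiple of `step` and keeps the area within budget.
        height = min(max_size, (max_area // width) // step * step)
        if height >= min_size:
            buckets.add((width, height))
            buckets.add((height, width))
        width += step
    return sorted(buckets)


def min_snr_weight(snr, gamma=5.0):
    """Min-SNR loss weight for epsilon prediction: min(SNR, gamma) / SNR."""
    return min(snr, gamma) / snr


buckets = make_buckets()
print(buckets)

# Timesteps with high SNR (little noise) are down-weighted; low-SNR ones keep weight 1.
print(min_snr_weight(10.0), min_snr_weight(2.0))
```

Under this pixel budget the widest bucket is 2048x512, so the 4096 upper bound only matters if a larger area budget is used.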