suzushi committed on
Commit 121cc9f · verified
1 Parent(s): bf8cac2

Update README.md

  1. README.md +63 -2
README.md CHANGED
@@ -1,10 +1,71 @@
  ---
- license: other
  language:
  - en
  library_name: diffusers
  pipeline_tag: text-to-image
  tags:
  - text-to-image
  ---
- Converted from [https://huggingface.co/suzushi/miso-diffusion-xl-1.0-safetensors/blob/main/miso-diffusion-xl-1.0.safetensors](https://huggingface.co/suzushi/miso-diffusion-xl-1.0-safetensors/blob/main/miso-diffusion-xl-1.0.safetensors).
  ---
+ license: openrail++
  language:
  - en
  library_name: diffusers
  pipeline_tag: text-to-image
  tags:
  - text-to-image
+ base_model:
+ - stabilityai/stable-diffusion-xl-base-1.0
  ---
+
+ # Anime Stable Diffusion Model
+
+ A custom Stable Diffusion XL model fine-tuned for anime-style image generation, trained on a dataset of 172k anime images.
+ This is the first concept model in the series, released while I spend more time filtering and processing the
+ larger dataset. The model is still undertrained: it can reflect certain concepts, but many additional
+ improvements are needed.
+
+ ## Prompt
+ Prompts use Danbooru-style tags.
+
+ Quality tags: Masterpiece, high quality, normal quality, low quality
+ Aesthetic tags: Very aesthetic, aesthetic, pleasent, unpleasent
+
+ Additional special tags: High resolution, elegant, artist:
+
+
+ | Rating Modifier | Rating Criterion |
+ | --------------- | ---------------- |
+ | - | general |
+ | - | sensitive |
+ | nsfw | questionable |
+ | nsfw | explicit |
+
+ Recommended prompt order: rating tag, quality tag, aesthetic tag, (additional tag), general tag
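The prompt order can be sketched as a small helper that assembles a prompt string. This is a minimal illustration, not part of the model's tooling: `build_prompt` and its parameters are hypothetical names, the general tags in the usage example are placeholders, and reading the rating table as "add `nsfw` for questionable or explicit content, nothing otherwise" is an interpretation of the table rather than something the README states outright.

```python
# Hypothetical helper: joins tags in the order
# rating, quality, aesthetic, additional, general.

# Interpretation of the rating table: the modifier is the tag placed in the
# prompt; "-" is taken to mean "no rating tag".
RATING_MODIFIER = {
    "general": None,
    "sensitive": None,
    "questionable": "nsfw",
    "explicit": "nsfw",
}


def build_prompt(rating, quality, aesthetic, general_tags, additional_tags=()):
    """Return a comma-separated prompt string in the recommended tag order."""
    parts = []
    modifier = RATING_MODIFIER[rating]
    if modifier is not None:
        parts.append(modifier)
    parts.append(quality)
    parts.append(aesthetic)
    parts.extend(additional_tags)
    parts.extend(general_tags)
    return ", ".join(parts)


prompt = build_prompt(
    rating="general",
    quality="masterpiece",
    aesthetic="very aesthetic",
    general_tags=["1girl", "solo", "looking at viewer"],  # placeholder tags
    additional_tags=["high resolution"],
)
print(prompt)
```

Keeping the ordering in one function makes it easy to vary only the general tags between generations while the rating, quality, and aesthetic prefix stays fixed.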
+
+ ### Dataset Specifications
+ - Total Images: 172k
+ - General Training Set: 160k images
+ - Aesthetic Fine-tuning Set: 12k high-quality images
+ - Resolution: 1024x1024
+
+ ### Hardware Configuration
+ - GPUs: 2x NVIDIA RTX 6000 Ada
+ - Training Time: 16 days (General), 3 days (Aesthetic fine tune)
48
+ ### Training Configuration
49
+
50
+ | Parameter | Value | Description |
51
+ |-----------|--------|-------------|
52
+ | Resolution | 1024x1024 | Training resolution |
53
+ | Batch Size | 8x2x2 | Effective batch size |
54
+ | Learning Rate | 5e-5 | Base learning rate |
55
+ | Text Encoder LR | 1e-5 | Learning rate for text encoder |
56
+ | Epochs | 10 | Total training epochs |
57
+ | Mixed Precision | FP16 | Training precision mode |
58
+ | Optimizer | AdamW8bit | Optimizer type |
59
+
60
+ ### Advanced Settings
61
+
62
+ | Feature | Setting | Purpose |
63
+ |---------|---------|----------|
64
+ | Gradient Checkpointing | Enabled | Memory optimization |
65
+ | XFormers | Enabled | Attention optimization |
66
+ | Memory Efficient Attention | Enabled | Memory optimization |
67
+ | Bucket Resolution Steps | 128 | Dynamic resolution handling |
68
+ | Min Bucket Resolution | 512 | Minimum image size |
69
+ | Max Bucket Resolution | 4096 | Maximum image size |
70
+ | Noise Offset | 0.035 | Training stability |
71
+ | Min SNR Gamma | 5 | Signal-to-noise ratio control |
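To illustrate what the Min SNR Gamma setting controls: in the Min-SNR weighting strategy, the loss at each timestep is scaled by min(SNR, gamma) / SNR when the model predicts noise, so low-noise (high-SNR) timesteps are down-weighted and the weight never exceeds 1. The snippet below is a plain-Python illustration of that formula with gamma = 5, not the training code used for this model; `min_snr_weight` is a hypothetical name, and the noise-prediction objective is an assumption.

```python
def min_snr_weight(snr: float, gamma: float = 5.0) -> float:
    """Loss weight min(snr, gamma) / snr for a noise-prediction objective."""
    if snr <= 0:
        raise ValueError("snr must be positive")
    return min(snr, gamma) / snr


# Timesteps with SNR at or below gamma keep full weight;
# cleaner timesteps (SNR above gamma) are scaled down.
for snr in (0.5, 5.0, 20.0):
    print(snr, min_snr_weight(snr))
```

With gamma = 5, a timestep with SNR 20 contributes a quarter of its unweighted loss, while any timestep with SNR of 5 or less is left unchanged.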