Update README.md
README.md
CHANGED
@@ -1,3 +1,133 @@
---
license: openrail++
tags:
- text-to-image
- stable-diffusion
- diffusers
---

# AnimeBoysXL v3.0

**It takes substantial time and effort to bake models. If you appreciate my models, I would be grateful if you could support me on [Ko-fi](https://ko-fi.com/koolchh) ☕.**

## Features

- ✔️ **Good for inference**: AnimeBoysXL v3.0 is a flexible model that is good at generating images of anime boys and male-only content in a wide range of styles.
- ✔️ **Good for training**: AnimeBoysXL v3.0 is suitable for further training, thanks to its neutral style and its ability to recognize a large number of concepts. Feel free to train your own anime boy model/LoRA from AnimeBoysXL.
- ❌ AnimeBoysXL v3.0 is not optimized for creating anime girls. Please consider using other models for that purpose.

## Inference Guide

- **Prompt**: Use tag-based prompts to describe your subject.
  - Tag ordering matters. It is highly recommended to structure your prompt with one of the following templates:
    ```
    1boy, male focus, character name, series name, anything else you'd like to describe
    ```
    ```
    2boys, male focus, multiple boys, character name(s), series name, anything else you'd like to describe
    ```
  - Append
    ```
    , best quality, amazing quality, best aesthetic, amazing aesthetic, absurdres
    ```
    to the prompt to improve image quality.
  - (*Optional*) Append
    ```
    , year YYYY
    ```
    to the prompt to shift the output toward the prevalent style of that year. `YYYY` is a four-digit year, e.g. `, year 2023`.
- **Negative prompt**: Choose one of the following two presets.
  1. Heavy (*recommended*):
     ```
     lowres, bad, text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts
     ```
  2. Light:
     ```
     lowres, jpeg artifacts, worst quality, watermark, blurry, bad aesthetic
     ```
- **VAE**: Make sure you're using the [SDXL VAE](https://huggingface.co/stabilityai/sdxl-vae/tree/main).
- **Sampling method, sampling steps and CFG scale**: I find **(Euler a, 28, 5)** good. You are encouraged to experiment with other settings.
- **Width and height**: **832×1216** for portrait, **1024×1024** for square, and **1216×832** for landscape.

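The prompt conventions above can be assembled programmatically. The following is a minimal sketch, not part of the model's tooling: `build_prompt`, `QUALITY_TAGS`, and `NEGATIVE_PRESETS` are hypothetical names introduced here, and placing the year tag after the quality tags is an assumption, since the guide only says that both are appended.

```python
from typing import Optional, Sequence

# Quality suffix and negative presets copied from the Inference Guide.
QUALITY_TAGS = "best quality, amazing quality, best aesthetic, amazing aesthetic, absurdres"
NEGATIVE_PRESETS = {
    "heavy": (
        "lowres, bad, text, error, missing, extra, fewer, cropped, jpeg artifacts, "
        "worst quality, bad quality, watermark, bad aesthetic, unfinished, "
        "chromatic aberration, scan, scan artifacts"
    ),
    "light": "lowres, jpeg artifacts, worst quality, watermark, blurry, bad aesthetic",
}


def build_prompt(subject_tags: Sequence[str], year: Optional[int] = None) -> str:
    """Join subject tags in the given order, then append quality and year tags.

    The caller is responsible for ordering subject_tags as in the templates
    (e.g. "1boy, male focus, character name, series name, ...").
    """
    parts = [tag.strip() for tag in subject_tags if tag.strip()]
    parts.append(QUALITY_TAGS)
    if year is not None:
        # Assumption: the year tag goes last, after the quality tags.
        parts.append(f"year {year}")
    return ", ".join(parts)


prompt = build_prompt(["1boy", "male focus", "solo", "black hair", "short hair"], year=2023)
negative_prompt = NEGATIVE_PRESETS["heavy"]
print(prompt)
```

The resulting `prompt` and `negative_prompt` strings can be passed directly to a pipeline call.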
## 🧨 Diffusers Example Usage

```python
import torch
from diffusers import DiffusionPipeline

# Load the fp16 safetensors weights and move the pipeline to the GPU.
pipe = DiffusionPipeline.from_pretrained("Koolchh/AnimeBoysXL-v3.0", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
pipe.to("cuda")

prompt = "1boy, male focus, shirt, solo, looking at viewer, smile, black hair, brown eyes, short hair, best quality, amazing quality, best aesthetic, amazing aesthetic, absurdres"
# Heavy negative prompt preset from the Inference Guide.
negative_prompt = "lowres, bad, text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts"

# Square output with the recommended CFG scale (5) and sampling steps (28).
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    guidance_scale=5,
    num_inference_steps=28
).images[0]
```

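To switch between the recommended resolutions without editing the call by hand, the settings from the Inference Guide can be collected in a small helper. This is only a sketch: `RESOLUTIONS` and `generation_kwargs` are hypothetical names introduced here, and the defaults mirror the suggested (28 steps, CFG 5) values.

```python
# Recommended (width, height) pairs from the Inference Guide.
RESOLUTIONS = {
    "portrait": (832, 1216),
    "square": (1024, 1024),
    "landscape": (1216, 832),
}


def generation_kwargs(orientation: str = "square", steps: int = 28, cfg_scale: float = 5.0) -> dict:
    """Return keyword arguments for the pipeline call for a given orientation."""
    if orientation not in RESOLUTIONS:
        raise ValueError(f"unknown orientation {orientation!r}; expected one of {sorted(RESOLUTIONS)}")
    width, height = RESOLUTIONS[orientation]
    return {
        "width": width,
        "height": height,
        "guidance_scale": cfg_scale,
        "num_inference_steps": steps,
    }


kwargs = generation_kwargs("portrait")
# With the pipeline from the example above:
# image = pipe(prompt=prompt, negative_prompt=negative_prompt, **kwargs).images[0]
```

The example above uses whichever scheduler the repository ships with. If that is not Euler a, the sampler can be set explicitly in Diffusers with `pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)`, where `EulerAncestralDiscreteScheduler` is imported from `diffusers`.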
## Training Details

AnimeBoysXL v3.0 is trained from [Pony Diffusion V6 XL](https://civitai.com/models/257749/pony-diffusion-v6-xl) on ~516k images.

The following tags are attached to the training data to make it easier to steer toward either more aesthetic or more flexible results.

### Quality tags

| tag               | score     |
|-------------------|-----------|
| `best quality`    | >= 150    |
| `amazing quality` | [75, 150) |
| `great quality`   | [25, 75)  |
| `normal quality`  | [0, 25)   |
| `bad quality`     | (-5, 0)   |
| `worst quality`   | <= -5     |

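When tagging your own dataset for further training, the score intervals in the quality table can be applied as follows. This is a sketch of the thresholds only; `quality_tag` is a hypothetical helper, and the table does not say how the score itself is computed, so that is left to the caller.

```python
def quality_tag(score: float) -> str:
    """Map a numeric score to a quality tag using the intervals in the table."""
    if score >= 150:
        return "best quality"
    if score >= 75:
        return "amazing quality"   # [75, 150)
    if score >= 25:
        return "great quality"     # [25, 75)
    if score >= 0:
        return "normal quality"    # [0, 25)
    if score > -5:
        return "bad quality"       # (-5, 0)
    return "worst quality"         # <= -5


for s in (200, 80, 30, 10, -2, -5):
    print(s, quality_tag(s))
```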
### Aesthetic tags

| tag                 |
|---------------------|
| `best aesthetic`    |
| `amazing aesthetic` |
| `great aesthetic`   |
| `normal aesthetic`  |
| `bad aesthetic`     |

### Rating tags

| tag             | rating       |
|-----------------|--------------|
| `sfw`           | general      |
| `slightly nsfw` | sensitive    |
| `fairly nsfw`   | questionable |
| `very nsfw`     | explicit     |

### Year tags

`year YYYY` where `YYYY` is in the range of [2005, 2023].

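The rating and year conventions can be expressed as a lookup table and a range check, which may be handy when captioning new training data. A minimal sketch; `RATING_TAGS`, `rating_tag`, and `year_tag` are hypothetical names, not part of any released tooling.

```python
# Rating-to-tag mapping from the rating table.
RATING_TAGS = {
    "general": "sfw",
    "sensitive": "slightly nsfw",
    "questionable": "fairly nsfw",
    "explicit": "very nsfw",
}

# Inclusive range of years covered by the training data's year tags.
YEAR_MIN, YEAR_MAX = 2005, 2023


def rating_tag(rating: str) -> str:
    """Return the caption tag for a rating name; raises KeyError if unknown."""
    return RATING_TAGS[rating]


def year_tag(year: int) -> str:
    """Return `year YYYY`, rejecting years outside the documented range."""
    if not YEAR_MIN <= year <= YEAR_MAX:
        raise ValueError(f"year must be in [{YEAR_MIN}, {YEAR_MAX}], got {year}")
    return f"year {year}"


print(rating_tag("sensitive"), "|", year_tag(2023))
```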
### Training configurations

- Hardware: 4 * Nvidia A100 80GB GPUs
- Optimizer: AdaFactor
- Gradient accumulation steps: 8
- Batch size: 4 * 8 * 4 = 128
- Learning rates:
  - 8e-6 for U-Net
  - 5.2e-6 for text encoder 1 (CLIP ViT-L)
  - 4.8e-6 for text encoder 2 (OpenCLIP ViT-bigG)
- Learning rate schedule: constant with 250 warmup steps
- Mixed precision training type: FP16
- Epochs: 40

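The batch-size arithmetic and the warmup schedule above can be worked through as follows. This is a rough sketch under stated assumptions: the list gives 4 GPUs and 8 gradient accumulation steps, so the remaining factor of 4 is assumed to be the per-device batch size; the dataset size is the approximate ~516k figure; and the warmup is assumed to be linear. The derived step counts are therefore estimates, not reported values.

```python
import math

NUM_GPUS = 4
GRAD_ACCUM_STEPS = 8
PER_DEVICE_BATCH = 4      # assumption: the remaining factor in 4 * 8 * 4
DATASET_SIZE = 516_000    # approximate (~516k images)
EPOCHS = 40
WARMUP_STEPS = 250

# Effective batch size per optimizer step: 4 * 8 * 4 = 128.
effective_batch = NUM_GPUS * GRAD_ACCUM_STEPS * PER_DEVICE_BATCH

# Estimated optimizer steps, ignoring any bucketing or dropped batches.
steps_per_epoch = math.ceil(DATASET_SIZE / effective_batch)
total_steps = steps_per_epoch * EPOCHS


def lr_at(step: int, base_lr: float) -> float:
    """Constant learning rate after a linear ramp over WARMUP_STEPS (ramp shape assumed)."""
    if step < WARMUP_STEPS:
        return base_lr * step / WARMUP_STEPS
    return base_lr


print(effective_batch, steps_per_epoch, total_steps)
print(lr_at(125, 8e-6), lr_at(1000, 8e-6))
```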
### Changes from v2.0

- Change the base model from Stable Diffusion XL Base 1.0 to Pony Diffusion V6 XL.
- Revamp the dataset's aesthetic tag based on the developer's preference.
- Update quality score and aesthetic score criteria.
- Use FP16 mixed-precision training.
- Train for more epochs.