---
license: openrail++
tags:
- text-to-image
- stable-diffusion
- diffusers
---

# AnimeBoysXL v3.0

**It takes substantial time and effort to bake models. If you appreciate my models, I would be grateful if you could support me on [Ko-fi](https://ko-fi.com/koolchh) ☕.**

## Features

- ✔️ **Good for inference**: AnimeBoysXL v3.0 is a flexible model that is good at generating images of anime boys and male-only content in a wide range of styles.
- ✔️ **Good for training**: AnimeBoysXL v3.0 is suitable for further training, thanks to its neutral style and its ability to recognize a wide range of concepts. Feel free to train your own anime boy model/LoRA from AnimeBoysXL.

## Inference Guide

- **Prompt**: Use tag-based prompts to describe your subject.
  - Tag ordering matters. It is highly recommended to structure your prompt using one of the following templates:
    ```
    1boy, male focus, character name, series name, anything else you'd like to describe, best quality, amazing quality, best aesthetic, absurdres
    ```
    ```
    2boys, male focus, multiple boys, character name(s), series name, anything else you'd like to describe, best quality, amazing quality, best aesthetic, absurdres
    ```
- **Negative prompt**: Choose one of the following two presets.
  1. Heavy (*recommended*):
     ```
     lowres, bad, text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts
     ```
  2. Light:
     ```
     lowres, jpeg artifacts, worst quality, watermark, blurry, bad aesthetic
     ```
- **VAE**: Make sure you're using [SDXL VAE](https://huggingface.co/stabilityai/sdxl-vae/tree/main).
- **Sampling method, sampling steps and CFG scale**: I find **(Euler a, 28, 8.5)** to work well. You are encouraged to experiment with other settings.
- **Width and height**: **832×1216** for portrait, **1024×1024** for square, and **1216×832** for landscape.

## 🧨 Diffusers Example Usage

```python
import torch
from diffusers import DiffusionPipeline

# Load the fp16 weights and move the pipeline to the GPU.
pipe = DiffusionPipeline.from_pretrained(
    "Koolchh/AnimeBoysXL-v3.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
)
pipe.to("cuda")

prompt = "1boy, male focus, shirt, solo, looking at viewer, smile, black hair, brown eyes, short hair, best quality, amazing quality, best aesthetic, absurdres"
negative_prompt = "lowres, bad, text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts"

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    guidance_scale=8.5,
    num_inference_steps=28,
).images[0]
```
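To match the settings recommended in the inference guide above (Euler a sampler, 28 steps, CFG 8.5, portrait resolution), you can swap in the Euler Ancestral scheduler. This is a minimal follow-up sketch that reuses `pipe`, `prompt` and `negative_prompt` from the example above; the output filename is only an illustration.

```python
from diffusers import EulerAncestralDiscreteScheduler

# Switch to the Euler Ancestral ("Euler a") sampler recommended in the inference guide.
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

# Generate a portrait image at the recommended 832x1216 resolution.
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=832,
    height=1216,
    guidance_scale=8.5,
    num_inference_steps=28,
).images[0]
image.save("animeboysxl_portrait.png")  # illustrative filename
```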

## Training Details

AnimeBoysXL v3.0 was trained from [Pony Diffusion V6 XL](https://civitai.com/models/257749/pony-diffusion-v6-xl) on ~516k images.

The following tags are attached to the training data so that, at inference time, prompts can steer generation toward either more aesthetic or more flexible results.

### Quality tags

| tag               | score     |
|-------------------|-----------|
| `best quality`    | >= 150    |
| `amazing quality` | [75, 150) |
| `great quality`   | [25, 75)  |
| `normal quality`  | [0, 25)   |
| `bad quality`     | (-5, 0)   |
| `worst quality`   | <= -5     |
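The thresholds above can be restated in code. The following helper is purely an illustration of the table, not the actual tagging pipeline (which is not published); the function name and its single `score` argument are assumptions.

```python
def quality_tag(score: float) -> str:
    """Restate the quality-tag thresholds from the table above (illustration only)."""
    if score >= 150:
        return "best quality"
    if score >= 75:
        return "amazing quality"
    if score >= 25:
        return "great quality"
    if score >= 0:
        return "normal quality"
    if score > -5:
        return "bad quality"
    return "worst quality"
```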

### Aesthetic tags

The aesthetic tags of AnimeBoysXL v3.0 reflect my aesthetic preference.

| tag                 |
|---------------------|
| `best aesthetic`    |
| `amazing aesthetic` |
| `great aesthetic`   |
| `normal aesthetic`  |
| `bad aesthetic`     |

### Rating tags

| tag             | rating       |
|-----------------|--------------|
| `sfw`           | general      |
| `slightly nsfw` | sensitive    |
| `fairly nsfw`   | questionable |
| `very nsfw`     | explicit     |
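The same mapping can be restated as a lookup, for illustration only (the variable name is an assumption):

```python
# Ratings from the table above mapped to the corresponding rating tags (illustration only).
RATING_TAGS = {
    "general": "sfw",
    "sensitive": "slightly nsfw",
    "questionable": "fairly nsfw",
    "explicit": "very nsfw",
}
```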

### Year tags

`year YYYY`, where `YYYY` is in the range [2005, 2023].

### Training configurations

- Hardware: 4 × Nvidia A100 80GB GPUs
- Optimizer: Adafactor
- Gradient accumulation steps: 8
- Effective batch size: 4 × 8 × 4 = 128 (GPUs × gradient accumulation steps × per-GPU batch size)
- Learning rates:
  - 8e-6 for U-Net
  - 5.2e-6 for text encoder 1 (CLIP ViT-L)
  - 4.8e-6 for text encoder 2 (OpenCLIP ViT-bigG)
- Learning rate schedule: constant with 250 warmup steps
- Mixed precision training type: FP16
- Epochs: 40
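The per-module learning rates above can be expressed as optimizer parameter groups. The following is a minimal sketch, not the actual training code: it borrows the released checkpoint's modules only to keep the example self-contained, and assumes the `transformers` implementation of Adafactor.

```python
from diffusers import StableDiffusionXLPipeline
from transformers.optimization import Adafactor

# Sketch only: load the released checkpoint's modules to illustrate the
# per-module learning rates listed above; this is not the actual training code.
pipe = StableDiffusionXLPipeline.from_pretrained("Koolchh/AnimeBoysXL-v3.0")

optimizer = Adafactor(
    [
        {"params": pipe.unet.parameters(), "lr": 8e-6},              # U-Net
        {"params": pipe.text_encoder.parameters(), "lr": 5.2e-6},    # text encoder 1 (CLIP ViT-L)
        {"params": pipe.text_encoder_2.parameters(), "lr": 4.8e-6},  # text encoder 2 (OpenCLIP ViT-bigG)
    ],
    scale_parameter=False,
    relative_step=False,
    warmup_init=False,
)
```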

### Changes from v2.0
- Change the base model from Stable Diffusion XL Base 1.0 to Pony Diffusion V6 XL.
- Revamp the dataset's aesthetic tags based on the developer's preference.
- Update the criteria for quality tags.
- Use FP16 mixed-precision training.
- Train for more epochs.

## Special thanks

J for his assistance with the showcase images.

## License

Since AnimeBoysXL v3.0 is a derivative model of [Pony Diffusion V6 XL](https://civitai.com/models/257749/pony-diffusion-v6-xl) by PurpleSmartAI, it is released under a different license from previous versions. Please read Pony Diffusion V6 XL's license before using this model.