awoo
Signed-off-by: Balazs Horvath <[email protected]>
- README.md +85 -10
- scripts/sdp_benchmark.py +39 -0
README.md
CHANGED
@@ -24,12 +24,22 @@ The Yiff Toolkit is a comprehensive set of tools designed to enhance your creati
 
 - [Hotdogwolf's Yiff Toolkit](#hotdogwolfs-yiff-toolkit)
   - [Table of Contents](#table-of-contents)
+  - [LoRA Training Guide](#lora-training-guide)
+    - [Installation Tips](#installation-tips)
+    - [Dataset Preparation](#dataset-preparation)
+    - [Pony Training](#pony-training)
+      - [`--dataset_repeats`](#--dataset_repeats)
+      - [`--max_train_steps`](#--max_train_steps)
+      - [`--shuffle_caption`](#--shuffle_caption)
+      - [`--sdpa`](#--sdpa)
+      - [`--sample_sampler`](#--sample_sampler)
   - [Embeddings for 1.5 and SDXL](#embeddings-for-15-and-sdxl)
   - [ComfyUI Walkthrough any%](#comfyui-walkthrough-any)
   - [AnimateDiff for Masochists](#animatediff-for-masochists)
   - [Stable Cascade Furry Bible](#stable-cascade-furry-bible)
   - [Resonance Cascade](#resonance-cascade)
   - [SDXL Furry Bible](#sdxl-furry-bible)
+    - [Some Common Knowledge Stuff](#some-common-knowledge-stuff)
     - [Pony Diffusion V6](#pony-diffusion-v6)
       - [Requirements](#requirements)
       - [Positive Prompt Stuff](#positive-prompt-stuff)
@@ -82,9 +92,76 @@ The Yiff Toolkit is a comprehensive set of tools designed to enhance your creati
 </details>
 </div>
 
+## LoRA Training Guide
+
+### Installation Tips
+
+First, download kohya_ss' [sd-scripts](https://github.com/kohya-ss/sd-scripts) and set up your environment either as [the Windows installation guide](https://github.com/kohya-ss/sd-scripts?tab=readme-ov-file#windows-installation) tells you, or, if you are using Linux or Miniconda on Windows, you are probably smart enough to figure out the installation yourself. I recommend always installing the latest [PyTorch](https://pytorch.org/get-started/locally/) in the virtual environment you are going to use, which at the time of writing is `2.2.2`. I hope future me has faster PyTorch!
+
+If someone told you to install `xformers`, call them stinky, because ever since the fused implementation of `sdpa` landed in torch, it has been the king of my benchmarks. For training you will have to go with either `--sdpa` or `--xformers`.
+### Dataset Preparation
+
+⚠️ **TODO:** Awoo this section.
+
+### Pony Training
+
+I'm not going to lie, it is a bit complicated to explain everything. But here is my best attempt, going through each line:
+#### `--dataset_repeats`
+
+Repeats the dataset when training with captions; by default it is set to `1`, so we'll set this to `0` with:
+
+```bash
+--dataset_repeats=0 \
+```
+#### `--max_train_steps`
+
+Specify the number of steps to train. If both `--max_train_steps` and `--max_train_epochs` are specified, the number of epochs takes precedence.
+
+```bash
+--max_train_steps=500 \
+```
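For intuition, the relation between epochs, repeats and total steps can be sketched with some back-of-the-envelope arithmetic. The numbers below are made up for illustration, and the formula is the usual steps-per-epoch relation, not sd-scripts internals:

```python
# Hypothetical numbers, just to illustrate how epochs, dataset size,
# repeats and batch size relate to total optimizer steps.
num_images = 100
dataset_repeats = 1
batch_size = 4
max_train_epochs = 20

steps_per_epoch = (num_images * dataset_repeats) // batch_size
max_train_steps = steps_per_epoch * max_train_epochs
print(max_train_steps)  # 500 with these numbers
```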
+#### `--shuffle_caption`
+
+Shuffles the captions on the separator set by `--caption_separator`, which is a comma `,` by default. That works perfectly for our case, since our captions look like this:
+
+```txt
+rating_questionable, 5 fingers, anthro, bent over, big breasts, blue eyes, blue hair, breasts, butt, claws, curved horn, female, finger claws, fingers, fur, hair, huge breasts, looking at viewer, looking back, looking back at viewer, nipples, nude, pink body, pink hair, pink nipples, rear view, solo, tail, tail tuft, tuft, by lunarii, by x-leon-x, mythology, krystal \(darkmaster781\), dragon, scalie, wickerbeast, The image showcases a pink-scaled wickerbeast a furred dragon creature with blue eyes., She has large breasts and a thick tail., Her blue and pink horns are curved and pointy and she has a slight smiling expression on her face., Her scales are shiny and she has a blue and pink pattern on her body., Her hair is a mix of pink and blue., She is looking back at the viewer with a curious expression., She has a slight blush.,
+```
+
+As you can tell, I have separated the caption part, not just the tags, with a `,` to make sure everything gets shuffled. At this point I'm pretty certain this is beneficial, especially when your caption file contains more than 77 tokens.
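What the flag does can be sketched in a few lines of plain Python. This is my own illustration, not sd-scripts code; the `separator` parameter mirrors `--caption_separator`:

```python
import random

def shuffle_caption(caption: str, separator: str = ",") -> str:
    # Split on the separator, shuffle the chunks, and re-join them.
    chunks = [c.strip() for c in caption.split(separator) if c.strip()]
    random.shuffle(chunks)
    return (separator + " ").join(chunks)

print(shuffle_caption("solo, anthro, blue eyes, tail"))
```

Note that every chunk gets shuffled, which is why separating the natural-language sentences with commas (as in the caption above) matters.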
+#### `--sdpa`
+
+The choice between `--xformers` and `--sdpa` will depend on your GPU. You can benchmark it by repeating a training run with both!
+#### `--sample_sampler`
+
+You have the option of generating images during training so you can check the progress. This argument lets you pick between different samplers; by default it is `ddim`, so you better change it!
+
+You can also use `--sample_every_n_epochs` instead, which will take precedence over steps. The `k_` prefix means Karras and the `_a` suffix means ancestral.
+
+```bash
+--sample_sampler="euler_a" \
+--sample_every_n_steps=100
+```
+
+My recommendation for Pony is `euler_a` for toony and `k_dpm_2` for realistic. Your options include the following:
+
+```bash
+ddim, pndm, lms, euler, euler_a, heun, dpm_2, dpm_2_a, dpmsolver, dpmsolver++, dpmsingle, k_lms, k_euler, k_euler_a, k_dpm_2, k_dpm_2_a
+```
+## Embeddings for 1.5 and SDXL
 
 Embeddings in Stable Diffusion are high-dimensional representations of input data, such as images or text, that capture their essential features and relationships. These embeddings are used to guide the diffusion process, enabling the model to generate outputs that closely match the desired characteristics specified in the input.
 
@@ -92,14 +169,10 @@ You can find in the [`/embeddings`](https://huggingface.co/k4d3/yiff_toolkit/tre
 
 ## ComfyUI Walkthrough any%
 
 ⚠️ Coming next year! ⚠️
 
 ## AnimateDiff for Masochists
 
 ⚠️ Coming in 2026! ⚠️
 
 ## Stable Cascade Furry Bible
@@ -110,18 +183,20 @@ You can find in the [`/embeddings`](https://huggingface.co/k4d3/yiff_toolkit/tre
 
 ## SDXL Furry Bible
 
+### Some Common Knowledge Stuff
+
+[Resolution Lora](https://huggingface.co/jiaxiangc/res-adapter/resolve/main/sdxl-i/resolution_lora.safetensors?download=true) is a nice thing to have; it will help with consistency. For SDXL it is just a LoRA you can load in, and it will do its magic. No need for a custom node or extension in this case.
 ### Pony Diffusion V6
 
 ---
 
 #### Requirements
 
+Download the [model](https://civitai.com/models/257749/pony-diffusion-v6-xl) and load it into whatever you use to generate images.
 
 #### Positive Prompt Stuff
 
 ```sd
 score_9, score_8_up, score_7_up, score_6_up, rating_explicit, source_furry,
 ```
@@ -155,7 +230,7 @@ Its a good thing to describe your subject or subjects start with `solo` or `duo`
 
 #### Negative Prompt Stuff
 
+⚠️ **WARNING: Super under construction!** ⚠️
 
 ### SeaArt Furry
 
scripts/sdp_benchmark.py
ADDED
@@ -0,0 +1,39 @@
+import torch
+import torch.nn.functional as F
+import torch.utils.benchmark as benchmark
+from torch.backends.cuda import sdp_kernel, SDPBackend
+
+device = "cuda" if torch.cuda.is_available() else "cpu"
+
+def benchmark_torch_function_in_milliseconds(f, *args, **kwargs):
+    t0 = benchmark.Timer(
+        stmt="f(*args, **kwargs)", globals={"args": args, "kwargs": kwargs, "f": f}
+    )
+    return t0.blocked_autorange().mean * 1e3  # Convert to milliseconds
+
+batch_size = 32
+max_sequence_len = 1024
+num_heads = 32
+embed_dimension = 32
+
+dtype = torch.float16
+
+query = torch.rand(batch_size, num_heads, max_sequence_len, embed_dimension, device=device, dtype=dtype)
+key = torch.rand(batch_size, num_heads, max_sequence_len, embed_dimension, device=device, dtype=dtype)
+value = torch.rand(batch_size, num_heads, max_sequence_len, embed_dimension, device=device, dtype=dtype)
+
+print(f"The default implementation runs in {benchmark_torch_function_in_milliseconds(F.scaled_dot_product_attention, query, key, value):.3f} milliseconds")
+
+backend_map = {
+    SDPBackend.MATH: {"enable_math": True, "enable_flash": False, "enable_mem_efficient": False},
+    SDPBackend.EFFICIENT_ATTENTION: {"enable_math": False, "enable_flash": False, "enable_mem_efficient": True}
+}
+
+with sdp_kernel(**backend_map[SDPBackend.MATH]):
+    print(f"The math implementation runs in {benchmark_torch_function_in_milliseconds(F.scaled_dot_product_attention, query, key, value):.3f} milliseconds")
+
+with sdp_kernel(**backend_map[SDPBackend.EFFICIENT_ATTENTION]):
+    try:
+        print(f"The memory efficient implementation runs in {benchmark_torch_function_in_milliseconds(F.scaled_dot_product_attention, query, key, value):.3f} milliseconds")
+    except RuntimeError:
+        print("EfficientAttention is not supported. See warnings for reasons.")