---
license: creativeml-openrail-m
tags:
- not-for-all-audiences
pipeline_tag: text-to-image
---

# FluffyRock Unbound v1.1 (**NEW**)

A finetune resumed from FluffyRock Unleashed v1.0, with the following changes:

### Technical changes:

- Adaptive timestep weighting: timesteps are weighted using a method similar to the EDM2 paper's, according to the homoscedastic uncertainty of the MSE loss at each timestep, thereby equalizing the contribution of each timestep. The loss weight is also conditioned on resolution in order to equalize the contribution of each resolution group. The overall effect is that the model is now very good at both high- and low-frequency detail and is less biased towards blurry backgrounds.
- EMA weights were assembled post hoc using the method described in the EDM2 paper. The shipped checkpoint uses an EMA length sigma of 0.225.
- Cross-attention masking is now applied to completely empty blocks of CLIP token embeddings, making the model work better with short prompts. Previously, an image with a short caption was fed in as if you had appended `BREAK BREAK BREAK` to the prompt in A1111, which caused the model to depend on those extra blocks and to produce better images only when given 225 tokens of input. The model is no longer dependent on this.
- The optimizer was replaced with schedule-free AdamW, and weight decay was turned off in bias layers, which has greatly stabilized training.

### Data input changes:

- Low-resolution images were removed from higher-resolution buckets. This removed approximately 1/3 of the images from the highest resolution group. In our testing we observed no negative impact on high-res generation quality, and the change should improve fine details in high-res images.
- The tokenizer used for training inputs was set up to never split tags down the middle. If a tag would run past the edge of a block, it is now moved to the next block, matching how most frontends behave.
- Random dropout is now applied to implied tags. More specific tags should therefore become more powerful and less dependent on their implied tags, while the more general tags remain present and usable.

### Dataset changes:

- A sizeable overhaul of E621 tagging was done, removing several useless tags and renaming others. We are including new tag files that represent the current state of the dataset.

----
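As a rough illustration of the adaptive timestep weighting described above (a minimal sketch only — the class and parameter names here are hypothetical, not the actual training code), the EDM2-style approach learns a per-timestep log-uncertainty and uses it to rescale the MSE loss, so that no single noise level dominates training:

```python
import torch


class TimestepUncertainty(torch.nn.Module):
    """Learns a per-timestep log-uncertainty u(t) (homoscedastic uncertainty).

    Dividing the MSE by exp(u) and adding u as a regularizer lets the
    optimizer balance each timestep's contribution automatically:
    timesteps with inherently large loss get large u and are down-weighted.
    """

    def __init__(self, num_timesteps: int = 1000):
        super().__init__()
        # One learnable log-variance per discrete timestep, initialized to 0.
        self.log_var = torch.nn.Parameter(torch.zeros(num_timesteps))

    def forward(self, mse_per_sample: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        u = self.log_var[t]  # log-uncertainty for each sample's timestep
        return (mse_per_sample / u.exp() + u).mean()
```

At initialization all `u` are zero, so the weighted loss equals the plain mean MSE; the weights then adapt during training. Conditioning on resolution, as the changelog describes, would amount to indexing this table by (timestep, resolution group) instead of timestep alone.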
![An orange wyvern stands over a stream, generated with FluffyRock Unleashed.](example-1.webp)
# FluffyRock Unleashed v1.0

A finetune of lodestones/fluffyrock-1088-minsnr-zsnr-vpred-ema-pytorch with the following changes:

- Caption dropout of 10% to improve classifier-free guidance
- Timesteps weighted by 1/(SNR + 1) to improve output quality
- A higher virtual batch size of 64
- Includes RedRocket/furception_vae for convenience (licensed separately; please refer to its repository for more information)

Like its predecessor, this is a v-prediction, zero-terminal-SNR model. Please use the provided .yaml file (or do whatever your preferred frontend requires to load a model as v-prediction with zero terminal SNR) so the model loads correctly, or you will have a lot of trouble generating correct outputs!

**NEW Feb 11 2024**: FluffyRock Unleashed Refiner has been released. Use it as a refiner model for the last 20% of generation for better fine details at no additional compute cost. For best results, use the default noise schedule.

The VAE decoder is licensed under Furception's license (CC-BY-NC-SA 4.0 as of writing, though subject to change). All other model components are licensed under the original terms of SD 1.5 (CreativeML OpenRAIL-M).

Special thanks to @RedHotTensors for general assistance.
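For reference, the 1/(SNR + 1) timestep weighting mentioned above can be computed from a standard DDPM-style cumulative-alpha schedule. This is a sketch under that assumption; the function name is illustrative, not taken from the actual training code:

```python
import torch


def snr_plus_one_weights(alphas_cumprod: torch.Tensor,
                         timesteps: torch.Tensor) -> torch.Tensor:
    """Per-sample loss weights 1/(SNR + 1), with SNR = alpha_bar / (1 - alpha_bar).

    Algebraically 1/(SNR + 1) == 1 - alpha_bar, so nearly-clean (low-noise)
    timesteps are down-weighted while very noisy ones approach weight 1.
    """
    abar = alphas_cumprod[timesteps]  # cumulative alpha at each sampled timestep
    snr = abar / (1.0 - abar)
    return 1.0 / (snr + 1.0)
```

Note that under a zero-terminal-SNR schedule the final timestep has alpha_bar = 0, so its SNR is 0 and its weight is exactly 1, which keeps the fully-noised endpoint well trained.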