	---
	license: mit
	library_name: diffusers
	---

	# Ostris VAE - KL-f8-d16

	A 16-channel VAE with 8x downsampling, trained from scratch on a balanced mix of photos, artwork, text, cartoons, and vector images.
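
To make the "16 channel, 8x downsample" description concrete, here is a minimal sketch of the latent tensor shape it implies for a given image size. The helper names `latent_shape` and `compression_ratio` are hypothetical (they are not part of diffusers or this repository), and the sketch assumes 3-channel RGB input whose height and width are multiples of 8.

```python
LATENT_CHANNELS = 16   # from the model name: d16
DOWNSAMPLE_FACTOR = 8  # from the model name: f8


def latent_shape(height: int, width: int) -> tuple:
    """Return (channels, latent_height, latent_width) for an input image size."""
    if height % DOWNSAMPLE_FACTOR or width % DOWNSAMPLE_FACTOR:
        raise ValueError("height and width are assumed to be multiples of 8")
    return (LATENT_CHANNELS, height // DOWNSAMPLE_FACTOR, width // DOWNSAMPLE_FACTOR)


def compression_ratio(height: int, width: int, image_channels: int = 3) -> float:
    """Ratio of image values to latent values (assumes an RGB input by default)."""
    c, h, w = latent_shape(height, width)
    return (image_channels * height * width) / (c * h * w)


print(latent_shape(512, 512))       # (16, 64, 64)
print(compression_ratio(512, 512))  # 12.0
```

Under these assumptions, a 512x512 RGB image maps to a 16x64x64 latent, which holds 12x fewer values than the image.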

	It is lighter than most VAEs, with only 57,266,643 parameters (vs. 83,819,683 for the SD3 VAE), which means it is faster and uses less VRAM, yet it scores quite similarly
	on real images. It is also MIT licensed, so you can do whatever you want with it.
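
As a quick check on the size claim, the parameter counts quoted above can be compared directly. This is only arithmetic on the two numbers from this card; `relative_size` is a hypothetical helper, not a library function.

```python
OSTRIS_PARAMS = 57_266_643  # Ostris KL-f8-d16, as stated in this card
SD3_PARAMS = 83_819_683     # SD3 VAE, as stated in this card


def relative_size(params: int, reference: int) -> float:
    """Fraction of the reference model's parameter count."""
    return params / reference


ratio = relative_size(OSTRIS_PARAMS, SD3_PARAMS)
print(f"{ratio:.1%} of the SD3 VAE parameter count ({1 - ratio:.1%} fewer)")
```

By this measure the model has roughly two thirds of the SD3 VAE's parameters. Actual speed and VRAM savings depend on the architecture and runtime, not on parameter count alone.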

	| VAE | PSNR (higher is better) | LPIPS (lower is better) | Parameters |
	|----|----|----|----|
	| sd-vae-ft-mse | 26.939 | 0.0581 | 83,653,863 |
	| SDXL | 27.370 | 0.0540 | 83,653,863 |
	| SD3 | 31.681 | 0.0187 | 83,819,683 |
	| Ostris KL-f8-d16 | 31.166 | 0.0198 | 57,266,643 |
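
For readers unfamiliar with the PSNR column, the sketch below shows the standard definition, 10 * log10(MAX^2 / MSE), on flat lists of pixel values. This card does not say how the scores above were computed (pixel range, evaluation images, resolution), so the 0 to 255 range and the function itself are assumptions for illustration, not the author's evaluation code.

```python
import math


def psnr(original, reconstructed, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same number of values")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs: no reconstruction error
    return 10 * math.log10(max_value ** 2 / mse)


# Toy example with hypothetical pixel values (not data from the table).
original = [0, 128, 255, 64]
reconstructed = [2, 126, 255, 66]
print(f"{psnr(original, reconstructed):.2f} dB")
```

Because PSNR is logarithmic, the 0.515 dB gap between SD3 and this VAE in the table corresponds to roughly 1.13x the mean squared error, assuming both were measured with the same pixel range.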

	### Compare
	Check out the comparison at [imgsli](https://imgsli.com/Mjc2MjA3).


	### What do I do with this?

	If you don't know, you probably don't need this. It is meant as an open-source, lighter-weight alternative to existing 16-channel VAEs.
	It is not useful on its own: a diffusion model has to be trained to work in its latent space first. I plan to do this myself for SD 1.5, SDXL, and possibly PixArt.
	[Follow me on Twitter](https://x.com/ostrisai) to keep up with my work on that.

	### Note: Not SD3 compatible
	This VAE is not SD3 compatible: it was trained from scratch and has an entirely different latent space.