Update README.md
README.md
CHANGED
---
license: mit
library_name: diffusers
model-index:
- name: 16ch-VAE
  results:
  - task:
      type: encoder-loss
    dataset:
      name: yerevann/coco-karpathy
      type: image
    metrics:
    - name: PSNR
      type: PSNR
      value: 31.1663
---

# Ostris VAE - KL-f8-d16

A 16-channel VAE with 8x downsample. Trained from scratch on a balanced mix of photos, artistic images, text, cartoons, and vector images.

It is lighter weight than most VAEs, with only 57,266,643 parameters (vs. 83,819,683 for the SD3 VAE), which means it is faster and uses less VRAM, yet it scores quite similarly on real images. Plus, it is MIT licensed, so you can do whatever you want with it.

| VAE | PSNR (higher is better) | LPIPS (lower is better) | # params |
|----|----|----|----|
| sd-vae-ft-mse | 26.939 | 0.0581 | 83,653,863 |
| SDXL | 27.370 | 0.0540 | 83,653,863 |
| SD3 | 31.681 | 0.0187 | 83,819,683 |
| **Ostris KL-f8-d16** | **31.166** | **0.0198** | **57,266,643** |
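
As a quick illustration of what the table measures, here is a minimal sketch of loading the VAE with diffusers, running a tensor through an encode/decode round trip, and computing PSNR on the reconstruction. The repo id `ostris/vae-kl-f8-d16` and the 512px input size are assumptions, not taken from this card.

```python
# Minimal sketch (assumed repo id and input size, not from this card).
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("ostris/vae-kl-f8-d16")  # assumed repo id
vae.eval()

# Stand-in for a batch of RGB images scaled to [-1, 1].
x = torch.rand(1, 3, 512, 512) * 2.0 - 1.0

with torch.no_grad():
    # 8x downsample, 16 latent channels: (1, 3, 512, 512) -> (1, 16, 64, 64)
    latents = vae.encode(x).latent_dist.sample()
    recon = vae.decode(latents).sample  # back to (1, 3, 512, 512)

# PSNR on [0, 1] images, the metric reported in the table above.
to01 = lambda t: (t.clamp(-1, 1) + 1) / 2
mse = torch.mean((to01(recon) - to01(x)) ** 2)
psnr = 10 * torch.log10(1.0 / mse)
print(f"latents: {tuple(latents.shape)}, PSNR: {psnr:.2f} dB")
```

On random noise the PSNR value is meaningless; swap in real images scaled to [-1, 1] to get numbers comparable to the table.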

### What do I do with this?

If you don't know, you probably don't need this. This is made as an open-source, lighter-weight 16-channel VAE. You would need to train it into a network before it is useful (a rough sketch of what that involves is below). I plan to do this myself for SD 1.5, SDXL, and possibly PixArt. [Follow me on Twitter](https://x.com/ostrisai) to keep up with my work on that.

### Note: Not SD3 compatible

This VAE is not SD3 compatible, as it is trained from scratch and has an entirely different latent space.