ostris committed (verified)
Commit b0de423 · Parent(s): d2e658c

Update README.md

Files changed (1): README.md +31 -1
README.md CHANGED
@@ -1,10 +1,40 @@
 ---
 license: mit
 library_name: diffusers
+model-index:
+- name: 16ch-VAE
+  results:
+  - task:
+      type: encoder-loss
+    dataset:
+      name: yerevann/coco-karpathy
+      type: image
+    metrics:
+    - name: PSNR
+      type: PSNR
+      value: 31.1663
 ---
 
 # Ostris VAE - KL-f8-d16
 
 A 16 channel VAE with 8x downsample. Trained from scratch on a balance of photos, artistic, text, cartoons, vector images.
 
-Testing in progress. more to come.
+It is lighter weight than most VAEs, with only 57,266,643 parameters (vs. the SD3 VAE's 83,819,683), which means it is faster and uses less VRAM yet scores quite similarly
+on real images. Plus it is MIT licensed, so you can do whatever you want with it.
+
+| VAE | PSNR (higher is better) | LPIPS (lower is better) | # params |
+|----|----|----|----|
+| sd-vae-ft-mse | 26.939 | 0.0581 | 83,653,863 |
+| SDXL | 27.370 | 0.0540 | 83,653,863 |
+| SD3 | 31.681 | 0.0187 | 83,819,683 |
+| **Ostris KL-f8-d16** | **31.166** | **0.0198** | **57,266,643** |
+
+### What do I do with this?
+
+If you don't know, you probably don't need this. It is meant as an open-source, lighter-weight 16-channel VAE.
+You would need to train it into a network before it is useful. I plan to do this myself for SD 1.5, SDXL, and possibly PixArt.
+[Follow me on Twitter](https://x.com/ostrisai) to keep up with my work on that.
+
+### Note: Not SD3 compatible
+This VAE is not SD3 compatible, as it is trained from scratch and has an entirely different latent space.
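
For reference, the PSNR metric reported in the model card follows the standard definition 10·log10(MAX² / MSE) between an original image and its reconstruction. The sketch below is a generic NumPy illustration of that formula, not the evaluation pipeline that produced the 31.166 figure (the toy image and noise level here are arbitrary assumptions):

```python
import numpy as np

def psnr(original, reconstructed, max_val=255.0):
    """Peak signal-to-noise ratio between two images (higher is better)."""
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        # Identical images: zero error, infinite PSNR.
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example: an 8-bit image vs. a slightly noisy "reconstruction".
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3))
noisy = np.clip(img + rng.normal(0, 5, size=img.shape), 0, 255)
print(round(psnr(img, noisy), 2))
```

Larger reconstruction error drives the MSE up and the PSNR down, which is why the ~31 dB scores of the 16-channel VAEs in the table indicate noticeably better reconstructions than the ~27 dB of the 4-channel ones.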