etri-vilab/koala-1b-llava-cap

ywlee88 committed on Jan 15, 2024

Commit 3572527 · verified · 1 Parent(s): 7e37a6e

Update README.md

Files changed (1)

README.md +10 -0

@@ -25,6 +25,16 @@ So we construct synthesized captions of LAION-aesthetics-V2 6+ by using a large
 KOALA-700M-LLaVA-Caption and KOALA-1B-LLaVA-Caption is trained on the synthesized caption-image pairs of LAION-aesthetics-V2 6+.
+## KOALA Model Cards
+|Model|link|
+|:--|:--|
+|koala-700m | https://huggingface.co/etri-vilab/koala-700m|
+|koala-700m-llava-cap | https://huggingface.co/etri-vilab/koala-700m-llava-cap|
+|koala-1b | https://huggingface.co/etri-vilab/koala-1bm|
+|koala-1b-llava-cap | https://huggingface.co/etri-vilab/koala-1b-llava-cap|
 ## Abstract
 ### TL;DR
 > We propose a fast text-to-image model, called KOALA, by compressing SDXL's U-Net and distilling knowledge from SDXL into our model. KOALA-700M can generate a 1024x1024 image in less than 1.5 seconds on an NVIDIA 4090 GPU, which is more than 2x faster than SDXL. KOALA-700M can be used as a decent alternative between SDM and SDXL in limited resources.