exdysa committed
Commit caf6cbf
1 Parent(s): 3a1ed93

init readme update

Files changed (1): README.md (+30, -62)
README.md CHANGED
@@ -14,54 +14,41 @@ language:
---
# CommonCanvas-XL-NC

- ## Summary
- CommonCanvas is a family of latent diffusion models that generate images from a given text prompt. The architecture is based on Stable Diffusion XL. Each CommonCanvas model is trained exclusively on a subset of the CommonCatalog dataset (see the Data Card), a large dataset of Creative Commons-licensed images with synthetic captions produced by a pre-trained BLIP-2 captioning model.
-
**Input:** CommonCatalog Text Captions
**Output:** CommonCatalog Images
**Architecture:** Stable Diffusion XL
**Version Number:** 0.1

- The goal of this project is to produce a model that is competitive with Stable Diffusion XL, but trained on an easily accessible dataset of known provenance. This makes replicating the model significantly easier and provides proper attribution to all of the Creative Commons works used to train it. The exact training recipe can be found in the paper: https://arxiv.org/abs/2310.16825
-
- ## Performance Limitations
-
- CommonCanvas under-performs in several categories, including faces, general photography, and paintings (see paper, Figure 8). These evaluation sets all originate from the Conceptual Captions dataset, which relies on web-scraped data. Web-sourced captions, while abundant, may not always align with the nuances of human-generated language. Transitioning to synthetic captions introduces certain performance challenges; however, the drop in performance is not as dramatic as one might assume.
-
- ## Training Dataset Limitations
- The model is trained on ten-year-old YFCC data and may not include modern concepts or recent events in its training corpus. Performance will be worse on certain proper nouns and specific celebrities, but this is a feature, not a bug. The model may not generate known artworks, individual celebrities, or specific locations due to the auto-generated nature of the caption data.
-
- Note: The non-commercial variants of this model are explicitly not intended to be used commercially.
- * It is trained on data derived from the Flickr100M dataset. The information is dated and known to have a bias towards internet-connected Western countries. Some areas, such as the Global South, lack representation.
-
- ## Associated Risks
- * Text in images produced by the model will likely be difficult to read.
- * The model struggles with complex tasks that require compositional understanding.
- * It may not accurately generate faces or representations of specific people.
- * The model primarily learned from English descriptions and may not perform as effectively in other languages.
- * The autoencoder aspect of the model introduces some information loss.
- * It may be possible to guide the model to generate objectionable content, i.e. nudity or other NSFW material.
-
- ## Intended Uses
- * Research on generative models, including probing and understanding their limitations and biases
- * Safe deployment of models that have the potential to generate harmful content
- * Generation of artworks and use in design and other artistic processes
- * Applications in educational or creative tools
-
- ## Unintended Uses
- * Commercial uses
-
- ## Usage
- We recommend using the MosaicML Diffusion Repo to finetune / train the model: https://github.com/mosaicml/diffusion
- Example finetuning code coming soon.
-
- ### Spaces demo
- Try the model demo on [Hugging Face Spaces](https://huggingface.co/spaces/common-canvas/CommonCanvas)
-
- ### Inference with 🧨 diffusers

```py
from diffusers import StableDiffusionXLPipeline
@@ -74,22 +61,3 @@ pipe = StableDiffusionXLPipeline.from_pretrained(
prompt = "a cat sitting in a car seat"
image = pipe(prompt, num_inference_steps=25).images[0]
```
- ### Inference with ComfyUI / AUTOMATIC1111
-
- [Download safetensors ⬇️](https://huggingface.co/common-canvas/CommonCanvas-XLNC/resolve/main/commoncanvas_xl_nc.safetensors?download=true)
-
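As a sketch, the checkpoint linked above can also be fetched programmatically; the repo id and filename are taken from that link, while the destination directory (a typical AUTOMATIC1111 models folder) is an assumption:

```python
# Hypothetical download helper; destination directory is an assumption.
REPO_ID = "common-canvas/CommonCanvas-XLNC"
FILENAME = "commoncanvas_xl_nc.safetensors"

def fetch_checkpoint(dest_dir: str = "models/Stable-diffusion") -> str:
    """Download the checkpoint and return its local path."""
    # Imported lazily so the sketch can be read without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(REPO_ID, FILENAME, local_dir=dest_dir)

if __name__ == "__main__":
    print(fetch_checkpoint())
```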
- ## Evaluation/Validation
- We validated the model against Stability AI’s SD2 model, comparing the two with a human user study.
-
- ## Acknowledgements
- We thank @multimodalart, @Wauplin, and @lhoestq at Hugging Face for helping us host the dataset and model weights.
-
- ## Citation
- ```
- @article{gokaslan2023commoncanvas,
-   title={CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images},
-   author={Gokaslan, Aaron and Cooper, A Feder and Collins, Jasmine and Seguin, Landan and Jacobson, Austin and Patel, Mihir and Frankle, Jonathan and Stephenson, Cory and Kuleshov, Volodymyr},
-   journal={arXiv preprint arXiv:2310.16825},
-   year={2023}
- }
- ```
 
---
# CommonCanvas-XL-NC

+ ## Specifications

**Input:** CommonCatalog Text Captions
**Output:** CommonCatalog Images
**Architecture:** Stable Diffusion XL
**Version Number:** 0.1
+ **Credit:** CommonCanvas, StabilityAI, mosaicML, @multimodalart, @Wauplin, @lhoestq
+ **NSFW:** Yes
+ **Paper:** https://arxiv.org/abs/2310.16825
+ **LICENSE:**
+ <p xmlns:cc="http://creativecommons.org/ns#">This work is licensed under <a href="https://creativecommons.org/licenses/by-nc-sa/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">CC BY-NC-SA 4.0
+ <img style="height:22px!important;margin-left:3px;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1" alt="creative commons license logo">
+ <img style="height:22px!important;margin-left:3px;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1" alt="attribution logo">
+ <img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/nc.svg?ref=chooser-v1" alt="non-commercial use logo">
+ <img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1" alt="share alike logo"></a></p>
+
+ ## Details
+ * training data : Flickr100M dataset
+ * bias : internet-connected Western countries
+ * limitations : text generation, complex composition, faces, non-English languages, VAE information loss
+ * use : research, deployment, examination, art, education, creative use
+ * prohibited : commercial use
+ * suggested training : mosaicML https://github.com/mosaicml/diffusion
+
+ ## Citation
+ ```
+ @article{gokaslan2023commoncanvas,
+   title={CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images},
+   author={Gokaslan, Aaron and Cooper, A Feder and Collins, Jasmine and Seguin, Landan and Jacobson, Austin and Patel, Mihir and Frankle, Jonathan and Stephenson, Cory and Kuleshov, Volodymyr},
+   journal={arXiv preprint arXiv:2310.16825},
+   year={2023}
+ }
+ ```

+ ### Code

```py
from diffusers import StableDiffusionXLPipeline

prompt = "a cat sitting in a car seat"
image = pipe(prompt, num_inference_steps=25).images[0]
```