echarlaix committed
Commit 8c140d1
2 Parent(s): 4a5d77b ffd13a1

merge main in branch

README.md CHANGED
@@ -48,7 +48,7 @@ The SDXL base model performs significantly better than the previous variants, an

### 🧨 Diffusers

- Make sure to upgrade diffusers to >= 0.18.0:
+ Make sure to upgrade diffusers to >= 0.19.0:
```
pip install diffusers --upgrade
```
@@ -58,7 +58,8 @@ In addition make sure to install `transformers`, `safetensors`, `accelerate` as
pip install invisible_watermark transformers accelerate safetensors
```

- You can use the model then as follows
+ To just use the base model, you can run:
+
```py
from diffusers import DiffusionPipeline
import torch
@@ -74,6 +75,48 @@ prompt = "An astronaut riding a green horse"
images = pipe(prompt=prompt).images[0]
```

+ To use the whole base + refiner pipeline as an ensemble of experts, you can run:
+
+ ```py
+ from diffusers import DiffusionPipeline
+ import torch
+
+ # load both base & refiner
+ base = DiffusionPipeline.from_pretrained(
+     "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
+ )
+ base.to("cuda")
+ refiner = DiffusionPipeline.from_pretrained(
+     "stabilityai/stable-diffusion-xl-refiner-1.0",
+     text_encoder_2=base.text_encoder_2,
+     vae=base.vae,
+     torch_dtype=torch.float16,
+     use_safetensors=True,
+     variant="fp16",
+ )
+ refiner.to("cuda")
+
+ # define how many steps and what fraction of steps to run on each expert (80/20 here)
+ n_steps = 40
+ high_noise_frac = 0.8
+
+ prompt = "A majestic lion jumping from a big stone at night"
+
+ # run both experts
+ image = base(
+     prompt=prompt,
+     num_inference_steps=n_steps,
+     denoising_end=high_noise_frac,
+     output_type="latent",
+ ).images
+ image = refiner(
+     prompt=prompt,
+     num_inference_steps=n_steps,
+     denoising_start=high_noise_frac,
+     image=image,
+ ).images[0]
+ ```
+
When using `torch >= 2.0`, you can improve the inference speed by 20-30% with `torch.compile`. Simply wrap the unet with `torch.compile` before running the pipeline:
```py
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
@@ -87,6 +130,58 @@ instead of `.to("cuda")`:
+ pipe.enable_model_cpu_offload()
```

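To put the memory optimization above in context: since `enable_model_cpu_offload()` replaces the `.to("cuda")` call, a minimal end-to-end sketch could look like the following (illustrative only, reusing the model id and prompt from the snippets above; not part of this diff):

```py
from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
)

# offload submodules to CPU and move each one to the GPU only while it runs,
# instead of keeping the whole pipeline resident via pipe.to("cuda")
pipe.enable_model_cpu_offload()

image = pipe(prompt="An astronaut riding a green horse").images[0]
```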
+ For more information on how to use Stable Diffusion XL with `diffusers`, please have a look at [the Stable Diffusion XL Docs](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/stable_diffusion_xl).
+
+ ### Optimum
+ [Optimum](https://github.com/huggingface/optimum) provides a Stable Diffusion pipeline compatible with both [OpenVINO](https://docs.openvino.ai/latest/index.html) and [ONNX Runtime](https://onnxruntime.ai/).
+
+ #### OpenVINO
+
+ To install Optimum with the dependencies required for OpenVINO:
+
+ ```bash
+ pip install optimum[openvino]
+ ```
+
+ To load an OpenVINO model and run inference with OpenVINO Runtime, you need to replace `StableDiffusionXLPipeline` with Optimum `OVStableDiffusionXLPipeline`. In case you want to load a PyTorch model and convert it to the OpenVINO format on-the-fly, you can set `export=True`.
+
+ ```diff
+ - from diffusers import StableDiffusionXLPipeline
+ + from optimum.intel import OVStableDiffusionXLPipeline
+
+ model_id = "stabilityai/stable-diffusion-xl-base-1.0"
+ - pipeline = StableDiffusionXLPipeline.from_pretrained(model_id)
+ + pipeline = OVStableDiffusionXLPipeline.from_pretrained(model_id)
+ prompt = "A majestic lion jumping from a big stone at night"
+ image = pipeline(prompt).images[0]
+ ```
+
+ You can find more examples (such as static reshaping and model compilation) in the Optimum [documentation](https://huggingface.co/docs/optimum/main/en/intel/inference#stable-diffusion-xl).
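As a sketch of the static reshaping and model compilation mentioned above (assuming the `reshape()` and `compile()` helpers that Optimum documents for its OpenVINO pipelines; not part of this diff):

```py
from optimum.intel import OVStableDiffusionXLPipeline

# export=True converts the PyTorch weights to the OpenVINO format on the fly
pipeline = OVStableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", export=True
)

# fix the input shapes, then recompile; static shapes typically speed up inference
batch_size, num_images, height, width = 1, 1, 1024, 1024
pipeline.reshape(batch_size=batch_size, height=height, width=width, num_images_per_prompt=num_images)
pipeline.compile()

image = pipeline(
    "A majestic lion jumping from a big stone at night",
    height=height,
    width=width,
    num_images_per_prompt=num_images,
).images[0]
```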
+
+ #### ONNX
+
+ To install Optimum with the dependencies required for ONNX Runtime inference:
+
+ ```bash
+ pip install optimum[onnxruntime]
+ ```
+
+ To load an ONNX model and run inference with ONNX Runtime, you need to replace `StableDiffusionXLPipeline` with Optimum `ORTStableDiffusionXLPipeline`. In case you want to load a PyTorch model and convert it to the ONNX format on-the-fly, you can set `export=True`.
+
+ ```diff
+ - from diffusers import StableDiffusionXLPipeline
+ + from optimum.onnxruntime import ORTStableDiffusionXLPipeline
+
+ model_id = "stabilityai/stable-diffusion-xl-base-1.0"
+ - pipeline = StableDiffusionXLPipeline.from_pretrained(model_id)
+ + pipeline = ORTStableDiffusionXLPipeline.from_pretrained(model_id)
+ prompt = "A majestic lion jumping from a big stone at night"
+ image = pipeline(prompt).images[0]
+ ```
+
+ You can find more examples in the Optimum [documentation](https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/models#stable-diffusion-xl).
+
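And a sketch of the on-the-fly ONNX export path described above, saving the converted model so later loads skip the export (the output directory name is hypothetical; not part of this diff):

```py
from optimum.onnxruntime import ORTStableDiffusionXLPipeline

# export=True converts the PyTorch weights to ONNX on the fly
pipeline = ORTStableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", export=True
)

# hypothetical local path; a later from_pretrained on it loads the ONNX files directly
pipeline.save_pretrained("./sdxl-base-onnx")

image = pipeline("A majestic lion jumping from a big stone at night").images[0]
```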

## Uses

@@ -117,4 +212,4 @@ The model was not trained to be factual or true representations of people or eve
- The autoencoding part of the model is lossy.

### Bias
- While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.
+ While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.
text_encoder/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e27bafa0b3029ad637ef3ace24ce1efe85b8d0dbd22e03a2e70bda6fc88963a1
+ size 492587457
text_encoder_2/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:162042ac6556e73f93d4172d4c67532c1cbe4dc7a6a8fa7e44dd2e3d7cbb772b
+ size 1041992
text_encoder_2/model.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3da7ac65349fbd092e836e3eeca2c22811317bc804fd70af157b4550f2d4bcb5
+ size 2778639360
unet/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6f001c090fb13c0d0f8b0a5916da814712a94400b99471fabe77c1c4a51ecaaf
+ size 7293842
unet/model.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7905b71f0044c5ea8fea8ca0451bd73cad53492ad50f964c49c3ff9250afa350
+ size 10269854720
vae_decoder/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0892c5e28b35791140467f7b9c9fa148c24238a5f0c381b1d4c22dcd2ed365cb
+ size 198093688
vae_encoder/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7b117fbb21531efd59d68c95682392785999bf3e0c2ce95647c6e0de9af36e74
+ size 136775724