Model generates noise with the bfloat16 dtype on CPU.

#123
by 0x6b64 - opened

Sample code:

import torch
from diffusers import StableDiffusion3Pipeline

# token: your Hugging Face access token
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.bfloat16,
    token=token,
)
pipe.to(torch.bfloat16)
image = pipe(
    "A cat holding a sign that says hello world",
    negative_prompt="",
    num_inference_steps=27,
    guidance_scale=7.0,
).images[0]

Generated image:
cpu_image.png

0x6b64 changed discussion status to closed

DO NOT RUN THIS CODE! It throws the error "LayerNormKernelImpl" not implemented for 'Half', and in my case it kept failing until I reinstalled the model.

Whew!
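For context, that error comes from PyTorch itself: the CPU LayerNorm kernel isn't implemented for float16, so a pipeline loaded in half precision and run on the CPU will hit it. One way to avoid it is to request reduced precision only when a GPU is actually available; a minimal sketch, reusing the model id from the snippets in this thread:

import torch
from diffusers import StableDiffusion3Pipeline

# Use reduced precision only where the kernels exist (GPU);
# fall back to float32 on CPU to avoid the half-precision LayerNorm error.
use_cuda = torch.cuda.is_available()
dtype = torch.bfloat16 if use_cuda else torch.float32

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=dtype,
)
pipe.to("cuda" if use_cuda else "cpu")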

The code below, which loads everything in float32, worked without reinstalling:

from diffusers import StableDiffusion3Pipeline
import torch

model_name = 'stabilityai/stable-diffusion-3-medium-diffusers'

try:
    # Load the model pipeline with torch.float32 for operations that require it
    pipe = StableDiffusion3Pipeline.from_pretrained(
        model_name,
        text_encoder_3=None,
        tokenizer_3=None,
        torch_dtype=torch.float32,  # use torch.float32 (instead of torch.bfloat16)
    )

    # Example input text and inference parameters
    input_text = "A cat holds a makeshift sign that reads: 'Hello World!' The background is bustling with oblivious passersby, completely unaware of this feline ambassador's bold declaration to the universe."
    negative_prompt = ""
    num_inference_steps = 28
    guidance_scale = 7.0

    # Perform inference
    result = pipe(
        input_text,
        negative_prompt=negative_prompt,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
    )

    # Retrieve and save the generated image
    generated_image = result.images[0]
    generated_image.save("cat_image.jpg")

    print("Image saved successfully.")

except Exception as e:
    print(f"Error: {e}")

cat_image.jpg
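A note on why this version is CPU-friendly: passing text_encoder_3=None and tokenizer_3=None drops the large T5 text encoder from the SD3 pipeline, which cuts memory use considerably (at some cost to prompt adherence), and loading in torch.float32 keeps every op on kernels the CPU actually implements.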
