RuntimeError: expected scalar type Half but found Float

#20
by AI-JAC - opened

Hi,

I am trying to run Stable Diffusion on Windows with an 8 GB CUDA GPU. However, 8 GB is not enough to run with the standard parameters:

RuntimeError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 8.00 GiB total capacity; 6.19 GiB already allocated; 0 bytes free; 6.31 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Therefore I followed the note "If you are limited by GPU memory and have less than 10GB of GPU RAM available, please make sure to load the StableDiffusionPipeline in float16 precision instead of the default float32 precision as done above." However, this gives me a different error: RuntimeError: expected scalar type Half but found Float

This is the code:

import torch
from diffusers import StableDiffusionPipeline
YOUR_TOKEN="..."
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16, use_auth_token=YOUR_TOKEN)
pipe.to("cuda")
prompt = "a photograph of an astronaut riding a horse"
image = pipe(prompt)["sample"][0]
image.save("astronaut_rides_horse.png")

I googled this but did not find a solution. Any hint is welcome. Many thanks.
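To make the numbers in the out-of-memory message above easier to read, here is a minimal sketch that parses it with the standard library only. The function name `parse_oom_message` and the dictionary keys are made up for illustration and are not part of PyTorch. For this message, reserved minus allocated is only 0.12 GiB, so the fragmentation hint (max_split_size_mb) looks unlikely to recover the 1 GiB that was requested. About 1.69 GiB of the card is not held by PyTorch at all; where that memory goes (CUDA context, other applications, the display) is an assumption and cannot be read from the message.

```python
import re

# The message quoted in the question, shortened to the part with the numbers.
OOM_MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 1024.00 MiB "
    "(GPU 0; 8.00 GiB total capacity; 6.19 GiB already allocated; "
    "0 bytes free; 6.31 GiB reserved in total by PyTorch)"
)

_UNIT_BYTES = {"bytes": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30}
_SIZE = r"([\d.]+) (bytes|KiB|MiB|GiB)"

# Hypothetical field names; each pattern anchors on the wording of the message.
_PATTERNS = {
    "requested": r"Tried to allocate " + _SIZE,
    "total": _SIZE + r" total capacity",
    "allocated": _SIZE + r" already allocated",
    "free": _SIZE + r" free",
    "reserved": _SIZE + r" reserved",
}


def parse_oom_message(message):
    """Return the sizes named in a CUDA OOM message, converted to bytes."""
    stats = {}
    for name, pattern in _PATTERNS.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find '{name}' in the message")
        value, unit = match.groups()
        stats[name] = float(value) * _UNIT_BYTES[unit]
    return stats


def gib(num_bytes):
    """Convert bytes to GiB, rounded to two decimals for display."""
    return round(num_bytes / 2**30, 2)


stats = parse_oom_message(OOM_MESSAGE)
# Memory PyTorch has cached but is not using for tensors right now.
cached_unused = stats["reserved"] - stats["allocated"]
# Memory on the card that PyTorch neither holds nor sees as free.
outside_pytorch = stats["total"] - stats["reserved"] - stats["free"]
print("requested GiB:", gib(stats["requested"]))
print("cached but unused GiB:", gib(cached_unused))
print("not held by PyTorch GiB:", gib(outside_pytorch))
```

Since the cached-but-unused amount is far smaller than the request, lowering the memory footprint (for example half precision or a smaller resolution) seems the more promising direction here.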

Got the same issue when running with cuda.

When running on the CPU, I get the following error:

RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

Not sure why you need another scheduler lms. For me this simple code runs on CPU fine:

import torch
from diffusers import StableDiffusionPipeline
prompt = "a painting of a man sitting down and having a cup of tea in his house by the beach, by greg rutkowski, muted colors "
YOUR_TOKEN="..."
device = "cpu"
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=YOUR_TOKEN).to(device)
image = pipe(prompt)["sample"][0]
image.save("output.png")

However, this single picture takes 6 minutes on my CPU. I would like to speed this up with the GPU, so any ideas for solving the "RuntimeError: expected scalar type Half but found Float" error described above would be highly appreciated!

I got past the dtype error, but I cannot confirm that it fully works, since my GPU (GTX 970) has too little VRAM.

C:\Users\USERNAME\AppData\Local\Programs\Python\Python38\Lib\site-packages\diffusers\models\resnet.py:

Change ".float()" to ".half()" in rows 336 and 354

C:\Users\USERNAME\AppData\Local\Programs\Python\Python38\Lib\site-packages\diffusers\models\unet_2d_condition.py:

Add ".half()" to row 139

C:\Users\USERNAME\AppData\Local\Programs\Python\Python38\Lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py

Add ".half()" to row 137

After these changes I get out-of-memory (OOM) errors instead of the dtype errors.

Hi all,

What worked for me was the context manager used in the CompVis Gradio app:

with autocast("cuda"):

This only applies when running on a GPU. I think it resolves the Half/Float mismatch described above.
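For anyone who wants to see where the autocast context goes, here is a minimal sketch based on the code from the question. It assumes the diffusers version used in this thread, where the pipeline output is indexed with ["sample"] and the token is passed as use_auth_token; newer releases may use different names. The helpers `select_precision` and `generate_image` are hypothetical names for illustration. On the CPU the sketch falls back to float32, because half precision on the CPU produced the "LayerNormKernelImpl" error reported above.

```python
import contextlib


def select_precision(cuda_available):
    """Pick (device, dtype name): float16 on a CUDA GPU, float32 on the CPU."""
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "float32"


def generate_image(prompt, token, out_path="astronaut_rides_horse.png"):
    """Generate one image and save it; a sketch, not tested on every setup."""
    # Imported lazily so select_precision can be used without torch installed.
    import torch
    from diffusers import StableDiffusionPipeline

    device, dtype_name = select_precision(torch.cuda.is_available())

    load_kwargs = {
        "torch_dtype": getattr(torch, dtype_name),
        "use_auth_token": token,
    }
    if device == "cuda":
        # The fp16 weights live in a separate revision, as in the question.
        load_kwargs["revision"] = "fp16"

    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", **load_kwargs
    ).to(device)

    # autocast lets PyTorch choose matching dtypes per operation on the GPU;
    # on the CPU everything is float32 already, so no context is needed.
    if device == "cuda":
        context = torch.autocast("cuda")
    else:
        context = contextlib.nullcontext()

    with context:
        image = pipe(prompt)["sample"][0]

    image.save(out_path)
    return out_path
```

Calling generate_image("a photograph of an astronaut riding a horse", YOUR_TOKEN) would then correspond to the script in the question, with the pipeline call wrapped in autocast.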

Yes, that works for me now. For reference: this is on an NVIDIA RTX 2070 with 8 GB, but only up to a resolution of 512x512.

@enzokro : Thank you!

osanseviero changed discussion status to closed

Traceback (most recent call last):
File "D:\SD\stable-diffusion-webui\modules\call_queue.py", line 57, in f
res = list(func(*args, **kwargs))
File "D:\SD\stable-diffusion-webui\modules\call_queue.py", line 36, in f
res = func(*args, **kwargs)
File "D:\SD\stable-diffusion-webui\modules\txt2img.py", line 109, in txt2img
processed = processing.process_images(p)
File "D:\SD\stable-diffusion-webui\modules\processing.py", line 845, in process_images
res = process_images_inner(p)
File "D:\SD\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 59, in processing_process_images_hijack
return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
File "D:\SD\stable-diffusion-webui\modules\processing.py", line 959, in process_images_inner
p.setup_conds()
File "D:\SD\stable-diffusion-webui\modules\processing.py", line 1495, in setup_conds
super().setup_conds()
File "D:\SD\stable-diffusion-webui\modules\processing.py", line 506, in setup_conds
self.uc = self.get_conds_with_caching(prompt_parser.get_learned_conditioning, negative_prompts, total_steps, [self.cached_uc], self.extra_network_data)
File "D:\SD\stable-diffusion-webui\modules\processing.py", line 492, in get_conds_with_caching
cache[1] = function(shared.sd_model, required_prompts, steps, hires_steps, shared.opts.use_old_scheduling)
File "D:\SD\stable-diffusion-webui\modules\prompt_parser.py", line 188, in get_learned_conditioning
conds = model.get_learned_conditioning(texts)
File "D:\SD\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 669, in get_learned_conditioning
c = self.cond_stage_model(c)
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\SD\stable-diffusion-webui\modules\sd_hijack_clip.py", line 234, in forward
z = self.process_tokens(tokens, multipliers)
File "D:\SD\stable-diffusion-webui\modules\sd_hijack_clip.py", line 276, in process_tokens
z = self.encode_with_transformers(tokens)
File "D:\SD\stable-diffusion-webui\modules\sd_hijack_clip.py", line 331, in encode_with_transformers
outputs = self.wrapped.transformer(input_ids=tokens, output_hidden_states=-opts.CLIP_stop_at_last_layers)
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1568, in _call_impl
result = forward_call(*args, **kwargs)
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 822, in forward
return self.text_model(
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 740, in forward
encoder_outputs = self.encoder(
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 654, in forward
layer_outputs = encoder_layer(
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 383, in forward
hidden_states, attn_weights = self.self_attn(
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\SD\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 322, in forward
attn_output = torch.bmm(attn_probs, value_states)
RuntimeError: expected scalar type Half but found Float

I always get this error when I'm using embeddings.
