"No available kernel. Aborting execution"
#5
by
highbyte
- opened
edit: I worked around the issue by locating the package with `pip show samv2` and manually editing `site-packages/samv2/utils/misc.py` so that it always sets `use_flash_attn = False`.
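In case it helps anyone, the change amounts to roughly the sketch below. This is a hypothetical sketch only; the actual code in samv2's misc.py looks different, the point is just to hard-code the flag off.

# site-packages/samv2/utils/misc.py (hypothetical sketch, not the real file)
# Wherever the package decides whether Flash Attention should be used, e.g.
#   use_flash_attn = torch.cuda.is_available() and <capability check>
# force the flag off, presumably letting scaled_dot_product_attention fall back
# to a kernel that accepts the fp32 inputs mentioned in the warnings below:
use_flash_attn = False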
(Note: the CPU version works; it's the CUDA update in this commit that now fails: https://huggingface.co/spaces/SkalskiP/florence-sam/commit/d1212b2598115fab86b6bc23c27b097e3856b8ef)
I couldn't get it to install on Windows, but in Docker the Gradio UI comes up until you try to run it.
I've tried many things but couldn't find a solution.
I assume the error is related to these warnings (a sketch of what they suggest follows them):
UserWarning: Memory efficient kernel not used because: (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.cpp:718.)
UserWarning: Memory Efficient attention has been runtime disabled. (Triggered internally at ../aten/src/ATen/native/transformers/sdp_utils_cpp.h:495.)
UserWarning: Flash attention kernel not used because: (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.cpp:720.)
UserWarning: Expected query, key and value to all be of dtype: {Half, BFloat16}. Got Query dtype: float, Key dtype: float, and Value dtype: float instead. (Triggered internally at ../aten/src/ATen/native/transformers/sdp_utils_cpp.h:98.)
UserWarning: CuDNN attention kernel not used because: (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.cpp:722.)
UserWarning: The CuDNN backend needs to be enabled by setting the enviornment variable`TORCH_CUDNN_SDPA_ENABLED=1` (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.cpp:496.)
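The last warning names an environment variable for opting in to the cuDNN backend, and the others say the Flash and memory-efficient kernels are unavailable here, so another thing to try before inference is making sure the math fallback stays enabled. A minimal sketch, assuming the torch.backends.cuda toggles in recent PyTorch:

import os
import torch

# Opt in to the cuDNN SDPA backend named in the last warning (depending on the
# PyTorch version this may need to be set before the process starts):
os.environ["TORCH_CUDNN_SDPA_ENABLED"] = "1"

# And/or make sure the always-available math fallback isn't disabled, so
# scaled_dot_product_attention has at least one kernel that accepts fp32 inputs:
torch.backends.cuda.enable_math_sdp(True)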
full log:
Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
UserWarning: Memory efficient kernel not used because: (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.cpp:718.)
UserWarning: Memory Efficient attention has been runtime disabled. (Triggered internally at ../aten/src/ATen/native/transformers/sdp_utils_cpp.h:495.)
UserWarning: Flash attention kernel not used because: (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.cpp:720.)
UserWarning: Expected query, key and value to all be of dtype: {Half, BFloat16}. Got Query dtype: float, Key dtype: float, and Value dtype: float instead. (Triggered internally at ../aten/src/ATen/native/transformers/sdp_utils_cpp.h:98.)
UserWarning: CuDNN attention kernel not used because: (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.cpp:722.)
UserWarning: The CuDNN backend needs to be enabled by setting the enviornment variable`TORCH_CUDNN_SDPA_ENABLED=1` (Triggered internally at ../aten/src/ATen/native/transformers/cuda/sdp_utils.cpp:496.)
Traceback (most recent call last):
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
response = await route_utils.call_process_api(
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/gradio/route_utils.py", line 285, in call_process_api
output = await app.get_blocks().process_api(
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/gradio/blocks.py", line 1923, in process_api
result = await self.call_function(
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/gradio/blocks.py", line 1508, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
return await future
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
result = context.run(func, *args)
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/gradio/utils.py", line 818, in wrapper
response = f(*args, **kwargs)
File "/home/user/app/app.py", line 83, in process
detections = run_sam_inference(SAM_MODEL, image_input, detections)
File "/home/user/app/utils/sam.py", line 30, in run_sam_inference
mask, score, _ = model.predict(box=detections.xyxy, multimask_output=False)
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/sam2/sam2_image_predictor.py", line 269, in predict
masks, iou_predictions, low_res_masks = self._predict(
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/sam2/sam2_image_predictor.py", line 398, in _predict
low_res_masks, iou_predictions, _, _ = self.model.sam_mask_decoder(
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/sam2/modeling/sam/mask_decoder.py", line 136, in forward
masks, iou_pred, mask_tokens_out, object_score_logits = self.predict_masks(
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/sam2/modeling/sam/mask_decoder.py", line 213, in predict_masks
hs, src = self.transformer(src, pos_src, tokens)
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/sam2/modeling/sam/transformer.py", line 100, in forward
queries, keys = layer(
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/sam2/modeling/sam/transformer.py", line 166, in forward
queries = self.self_attn(q=queries, k=queries, v=queries)
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/sam2/modeling/sam/transformer.py", line 254, in forward
out = F.scaled_dot_product_attention(q, k, v, dropout_p=dropout_p)
RuntimeError: No available kernel. Aborting execution.
You can use https://github.com/facebookresearch/segment-anything-2 to replace samv2: follow its Installation section to install SAM-2 into your environment, and comment out `device=DEVICE` at line 239 of app.py:
inference_state = SAM_VIDEO_MODEL.init_state(
    video_path=frame_directory_path,
    # device=DEVICE
)
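If it helps, here is a rough sketch of wiring the official package in place of samv2. The config and checkpoint names are the ones published in that repo; the checkpoint path, the "small" variant, and the variable names are assumptions borrowed from app.py, so adjust them to your setup.

import torch
from sam2.build_sam import build_sam2, build_sam2_video_predictor
from sam2.sam2_image_predictor import SAM2ImagePredictor

DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Image predictor used by run_sam_inference (checkpoint path is an assumption)
SAM_MODEL = SAM2ImagePredictor(
    build_sam2("sam2_hiera_s.yaml", "./checkpoints/sam2_hiera_small.pt", device=DEVICE)
)

# Video predictor whose init_state call is shown above
SAM_VIDEO_MODEL = build_sam2_video_predictor(
    "sam2_hiera_s.yaml", "./checkpoints/sam2_hiera_small.pt", device=DEVICE
)

Commenting out `device=DEVICE` in `init_state` is presumably needed because the official predictor's `init_state` doesn't take a `device` argument; it uses the device the model was built with.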