RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
#1, opened by LaferriereJC
I'm trying to run this on CPU:
model = AutoGPTQForCausalLM.from_quantized(
    model_repo,
    device=device,
    use_safetensors=True,
    use_triton=device != "cpu",  # comment/remove if not on Linux
).to(device).to(torch.float32)
"""
model = AutoGPTQForCausalLM.from_quantized(
    model_repo,
    device=device,
    use_safetensors=True,
    use_triton=device != "cpu",  # comment/remove if not on Linux
).to(device)
"""
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
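
For anyone hitting the same message: it generally means a float16 ('Half') tensor reached a CPU matrix multiply in a PyTorch build that has no Half kernel for that op. The sketch below is not the AutoGPTQ code path. It uses a plain `nn.Linear` as a stand-in for a model layer, and `cpu_safe_dtype` is a hypothetical helper name introduced only for illustration. It shows the general pattern of choosing float32 whenever the target device is CPU and making sure both the weights and the inputs use that dtype. Whether this is enough for a GPTQ-quantized model is an open question here, since the quantized layers may do their own casting internally.

```python
import torch
from torch import nn


def cpu_safe_dtype(device: str, requested: torch.dtype = torch.float16) -> torch.dtype:
    """Hypothetical helper: fall back to float32 when float16 is requested on CPU."""
    if torch.device(device).type == "cpu" and requested == torch.float16:
        return torch.float32
    return requested


# Stand-in for a model layer whose weights were loaded in half precision.
layer = nn.Linear(8, 4).half()

device = "cpu"
dtype = cpu_safe_dtype(device)

# Cast the weights, and create the inputs with the same dtype.
layer = layer.to(device=device, dtype=dtype)
x = torch.randn(2, 8, dtype=dtype, device=device)

with torch.no_grad():
    y = layer(x)

print(y.dtype, tuple(y.shape))  # torch.float32 (2, 4)
```

If the error still appears after a cast like this, the remaining float16 tensor is probably created inside the forward pass rather than stored in the model, so checking the dtypes of the inputs and of intermediate activations is a reasonable next step.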