OSError: LoneStriker/Mixtral_34Bx2_MoE_60B-2.4bpw-h6-exl2 does not appear to have a file named model-00001-of-00013.safetensors.
Hello, I'm new here.
I downloaded the original model from cloudyu/Mixtral_34Bx2_MoE_60B before, but my VRAM isn't enough (4090, 24 GB),
so I tried this quantized model instead, and it gives the error message above.
Here's the code I use:
import torch
from transformers import AutoModelForCausalLM

Mixtral_model_name = "LoneStriker/Mixtral_34Bx2_MoE_60B-2.4bpw-h6-exl2"
model = AutoModelForCausalLM.from_pretrained(
    Mixtral_model_name,
    torch_dtype=torch.float32,
    device_map='cuda',
    local_files_only=False,
    # load_in_4bit=True
)
How can I fix this?
And how much VRAM will this model need?
Thanks!
Use either oobabooga's text-generation-webui or exui from GitHub to load this model; this repo is an exl2 quant, which transformers' from_pretrained can't load directly. For ooba, use the exllamav2 loader.
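If you'd rather stay in Python instead of the webui, the exllamav2 library can load exl2 quants directly. Here is a minimal sketch based on its example scripts; exact class names and arguments may differ between versions, and the local model path is a placeholder pointing at a downloaded copy of this repo:

from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Placeholder: path to a local download of the exl2 repo
model_dir = "/path/to/Mixtral_34Bx2_MoE_60B-2.4bpw-h6-exl2"

config = ExLlamaV2Config()
config.model_dir = model_dir
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # cache is allocated as layers load
model.load_autosplit(cache)                # spread weights across available GPU memory

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7
settings.top_p = 0.9

print(generator.generate_simple("Hello, my name is", settings, 128))

On a single 4090, load_autosplit just fills the one card, so whether it fits depends on the quant size and the context length you allow.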
@LoneStriker
Thanks! oobabooga's text-generation-webui works, but it still runs out of VRAM.
How much VRAM does this model need? It seems to be more than 24 GB.
It should fit in 24 GB of VRAM; you probably need to reduce the max tokens in ooba. Here are the model sizes for 2.4 and 2.65 bpw:
18 GB  Mixtral_34Bx2_MoE_60B-2.4bpw-h6-exl2
20 GB  Mixtral_34Bx2_MoE_60B-2.65bpw-h6-exl2
Drop your max tokens to 2048 to see if it loads.
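In the webui that's the max_seq_len setting on the exllamav2 loader. For reference, if you load from Python with exllamav2 instead, the equivalent is capping the context before the cache is allocated; a sketch under the same assumptions as above (version-dependent API, placeholder path), with a quick check of how much of the 24 GB is left after loading:

import torch
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache

config = ExLlamaV2Config()
config.model_dir = "/path/to/Mixtral_34Bx2_MoE_60B-2.4bpw-h6-exl2"  # placeholder
config.prepare()
config.max_seq_len = 2048          # smaller context -> smaller KV cache -> less VRAM

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # cache sized from config.max_seq_len
model.load_autosplit(cache)

free, total = torch.cuda.mem_get_info()
print(f"free VRAM: {free / 1e9:.1f} GB of {total / 1e9:.1f} GB")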