
Casting torch.bfloat16 to torch.float16.

#2
by nokados - opened

In the README, you recommend using --dtype half, which is equivalent to float16. However, the model's config.json specifies bfloat16, so vLLM warns that it is casting torch.bfloat16 to torch.float16. Perhaps it would be better to recommend the original --dtype bfloat16 instead?
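For reference, a minimal sketch of loading the model with its native dtype through the vLLM Python API (the model name below is a placeholder):

```python
from vllm import LLM

# Load with the dtype declared in config.json (bfloat16) instead of
# letting vLLM cast the weights down to float16.
llm = LLM(model="your-org/your-model", dtype="bfloat16")

# "auto" also works: it keeps whatever dtype config.json specifies.
# llm = LLM(model="your-org/your-model", dtype="auto")
```

One caveat: bfloat16 needs Ampere-or-newer GPUs (compute capability 8.0+), so on older hardware float16 is the practical choice, which may be why the README suggests --dtype half.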
