Llama 3 8B Instruct - Q8 vs FP16 vs FP32
#8, opened by Hearcharted
Hi Bartowski, do you know if there is much difference in response quality between them?
Many thanks in advance...
There shouldn't be much difference, but any quantization will affect the output to some degree.
If you can comfortably run f16, you should.
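For a rough feel of the size of that effect, here is a minimal sketch that round-trips some synthetic weights through FP16 and through a simple block-wise 8-bit scheme, then compares the error against the FP32 originals. Everything in it is an assumption for illustration: the weights are random numbers, not Llama 3 tensors, the helper names are made up, and the 8-bit scheme (blocks of 32 values with one scale per block) is only loosely modeled on block-wise formats, not the actual GGUF Q8_0 code. Weight error is also not the same thing as response quality, so treat it as intuition only.

```python
import numpy as np


def roundtrip_fp16(w):
    # Cast to half precision and back to float32.
    return w.astype(np.float16).astype(np.float32)


def roundtrip_q8_blocks(w, block_size=32):
    # Hypothetical block-wise 8-bit scheme: one scale per block of values.
    blocks = w.reshape(-1, block_size)
    scale = np.abs(blocks).max(axis=1, keepdims=True) / 127.0
    scale[scale == 0] = 1.0  # avoid dividing by zero for all-zero blocks
    q = np.clip(np.round(blocks / scale), -127, 127).astype(np.int8)
    # Dequantize back to float32 for comparison.
    return (q.astype(np.float32) * scale).reshape(w.shape)


def rms_error(a, b):
    # Root-mean-square difference between two arrays.
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Synthetic stand-in for a weight tensor (assumed distribution, not real data).
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.02, size=4096 * 32).astype(np.float32)

err_fp16 = rms_error(weights, roundtrip_fp16(weights))
err_q8 = rms_error(weights, roundtrip_q8_blocks(weights))

print(f"RMS error vs FP32, FP16 round trip: {err_fp16:.3e}")
print(f"RMS error vs FP32, 8-bit blocks:    {err_q8:.3e}")
```

On data like this, both errors come out small relative to the weights themselves, with the 8-bit round trip larger than the FP16 one.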
Thank you so much for your time, gentleman 🎩