Llama 3 8B Instruct - Q8 vs FP16 vs FP32

by Hearcharted - opened Jun 25

Jun 25

Hi Bartowski, do you know if there is much difference in response quality between them?
Many thanks in advance...

Owner Jun 27

There shouldn't be much difference, but any quantization will affect the output to some degree.
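To give a feel for why 8-bit quantization changes the output only slightly, here is a minimal sketch of block-wise absmax int8 quantization. It is a simplified illustration, not the exact scheme used by any particular file format; the block size of 32, the function names, and the random "weights" are all assumptions made up for the example.

```python
import random

def quantize_block(values):
    """Absmax-quantize one block of floats to int8; return (scale, ints)."""
    amax = max(abs(v) for v in values)
    scale = amax / 127.0 if amax > 0 else 1.0
    ints = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, ints

def dequantize_block(scale, ints):
    """Map the int8 codes back to floats."""
    return [scale * q for q in ints]

def roundtrip_max_error(weights, block_size=32):
    """Largest absolute error after quantizing and dequantizing every block."""
    worst = 0.0
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        scale, ints = quantize_block(block)
        restored = dequantize_block(scale, ints)
        worst = max(worst, max(abs(a - b) for a, b in zip(block, restored)))
    return worst

# Hypothetical stand-in for real weights: small Gaussian values.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(1024)]
max_abs_weight = max(abs(w) for w in weights)
err = roundtrip_max_error(weights)
print(f"largest weight: {max_abs_weight:.5f}, worst round-trip error: {err:.7f}")
```

Each value moves by at most half a quantization step, so the error is small relative to the weights but never exactly zero, which matches "not much difference, but some".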

If you can comfortably run FP16, you should.
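As a rough way to judge "comfortably", the weights-only size can be estimated from the parameter count and bits per weight. This is a back-of-the-envelope sketch: the round 8 billion parameter figure and the 8.5 bits per weight for Q8 (8-bit values plus a per-block scale) are assumptions, and real usage is higher once the KV cache and runtime overhead are added.

```python
# Assumed bits per weight for each format (Q8_0 figure includes block scales).
BITS_PER_WEIGHT = {"FP32": 32.0, "FP16": 16.0, "Q8_0": 8.5}

def weight_size_gib(n_params, bits_per_weight):
    """Approximate size of the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30

N_PARAMS = 8.0e9  # assumption: round figure for an "8B" model

for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{weight_size_gib(N_PARAMS, bits):.1f} GiB for weights")
```

If the FP16 figure fits in your memory with room to spare for context, FP16 is the safer choice; otherwise Q8 roughly halves the footprint.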

Jun 29

Thank you so much for your time, gentleman 🎩
