Loading model in 8bit
#37
by
abhi24
- opened
Does loading the model in 8-bit always lead to poorer quality, at least compared to the original model?
Can someone briefly describe what happens when we load it in 8-bit?
Instead of working with 16-bit floating-point numbers for the weights, you work with 8-bit integers. These have a much smaller range and precision, so the math is less accurate wherever it is done in 8-bit. It isn't necessarily faster, either, but it takes half the memory. It doesn't necessarily make the results much worse; some have experimented even with 4-bit math. For example, the Dolly 12B model runs on an A10 in 8-bit and the results seem pretty fine to me.
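To make the trade-off concrete, here is a minimal sketch of one common 8-bit scheme (absmax quantization, roughly what libraries like bitsandbytes do per block of weights): each weight tensor is mapped to int8 codes plus one floating-point scale, then dequantized back for the math. The weight values here are made up for illustration.

```python
import numpy as np

# Hypothetical fp16 weight values (illustrative only)
w = np.array([0.12, -1.5, 0.003, 0.9], dtype=np.float16)

# Absmax quantization: scale so the largest magnitude maps to 127
scale = float(np.abs(w).max()) / 127.0
q = np.round(w.astype(np.float32) / scale).astype(np.int8)

# Stored form: int8 codes (1 byte each) + one fp scale,
# vs. 2 bytes per weight in fp16 -- roughly half the memory.
w_hat = q.astype(np.float32) * scale  # dequantized approximation

print(q)      # the int8 codes
print(w_hat)  # close to w, but small values lose relative precision
```

Note how the smallest weight (0.003) rounds to code 0 and is lost entirely, while the largest weights survive almost exactly: that is the "smaller range and precision" cost, paid in exchange for the memory savings.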
Thank you for the insightful reply.
abhi24
changed discussion status to
closed