Did you use AutoGPTQ to quantize the model, or something else? I'm trying to replicate this myself, but AutoGPTQ doesn't support Cohere.
Please see #1