Quantization request
#1, opened by dillfrescott
Yes please, thank you
It would be awesome to have this in AWQ for fast inference on serving engines.
It's difficult right now, but I'll try.
Thanks!
Any word on quants or EXL2 that can run this on a 24 GB card? Would love to run this locally.