OOM with int4 quant
#8, opened by chungimungi
I'm getting an OOM error with 32 GB of VRAM using int4 quantization. Considering this is only a 16B-parameter model, it should fit on the GPU. I can easily run gemma-2 27B in int4 with the same specs.
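For reference, a rough back-of-envelope estimate of the weight memory involved, as a minimal sketch. It assumes every parameter is stored at exactly 4 bits and counts the weights only; activations, KV cache, quantization scales, and any layers kept in higher precision are ignored, so real usage will be higher. The helper name `weight_memory_gb` is hypothetical, and the parameter counts are the round figures from the post.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Parameter counts are the round figures quoted in the post.
print(weight_memory_gb(16e9, 4))  # 8.0  (16B model, int4)
print(weight_memory_gb(27e9, 4))  # 13.5 (27B model, int4)
```

By this estimate the int4 weights of either model are well under 32 GB, so an OOM would more likely come from something other than the quantized weights themselves, for example the model being materialized in higher precision before quantization. That is a guess, not a diagnosis.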