Unable to use on free tier Google Colab
#2, opened by sudhir2016
I tried running this model on free-tier Google Colab with int8 quantization using Quanto. The model loads, but it runs out of RAM during inference. I then tried int4 quantization. With int4, inference just keeps running and never finishes; I waited 20 minutes and then gave up.
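For context, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need at each precision. The parameter count below is a hypothetical placeholder, not the real size of this model, and the helper `weight_memory_gib` is just an illustrative name, so treat the numbers as an estimate under those assumptions.

```python
def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


# Hypothetical parameter count (assumption); replace with the real model size.
NUM_PARAMS = 8_000_000_000

for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: ~{weight_memory_gib(NUM_PARAMS, bits):.1f} GiB for weights")
```

This only counts the weights. Activations, any cache used during generation, and the temporary full-precision copy that may exist while the model is being quantized all come on top, which could be why a model that loads still runs out of RAM at inference time.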