Update README.md
README.md
CHANGED
@@ -39,7 +39,7 @@ Please note this is an experimental GPTQ model. Support for it is currently quit
 
 It is also expected to be **VERY SLOW**. This is currently unavoidable, but is being looked at.
 
-This
+This 4bit model requires at least 35GB VRAM to load. It can be used on 40GB or 48GB cards, but not less.
 
 Please be aware that you should currently expect around 0.7 tokens/s on 40B Falcon GPTQ.
 