Update README.md
README.md CHANGED
@@ -121,15 +121,14 @@ print(tokenizer.decode(output[0]))
**gptq_model-4bit--1g.safetensors**

-This will work with AutoGPTQ
+This will work with AutoGPTQ 0.2.0 and later.

It was created without groupsize to reduce VRAM requirements, and with `desc_act` (act-order) to improve inference quality.

* `gptq_model-4bit--1g.safetensors`
-* Works
+* Works with AutoGPTQ 0.2.0 and later.
* At this time it does not work with AutoGPTQ Triton, but support will hopefully be added in time.
-* Works with text-generation-webui using `--
-* At this time it does NOT work with one-click-installers
+* Works with text-generation-webui using `--trust-remote-code`
* Does not work with any version of GPTQ-for-LLaMa
* Parameters: Groupsize = None. Act order (desc_act)
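For context, a minimal sketch of loading this no-groupsize, `desc_act` file with AutoGPTQ 0.2.0 or later might look like the following. The repository id and prompt are placeholders (the commit does not name them), and the exact `from_quantized` arguments are assumptions based on the AutoGPTQ 0.2.x API rather than text from this README.

```python
# Hypothetical sketch: loading the desc_act, no-groupsize GPTQ file with AutoGPTQ >= 0.2.0.
# "TheBloke/model-GPTQ" and the prompt are placeholders, not names taken from this commit.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_name_or_path = "TheBloke/model-GPTQ"   # placeholder repo id
model_basename = "gptq_model-4bit--1g"       # the file described above, without the .safetensors suffix

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    model_basename=model_basename,
    use_safetensors=True,
    trust_remote_code=True,   # mirrors the --trust-remote-code note for text-generation-webui
    device="cuda:0",
    use_triton=False,         # Triton kernels are not supported for this file at the time of writing
)

prompt = "Tell me about AI"   # placeholder prompt
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda:0")
output = model.generate(input_ids=input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0]))
```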