TheBloke committed
Commit f352f2b
1 parent: e9e7d2c

Update README.md

Files changed (1): README.md (+4 −4)
README.md CHANGED

@@ -26,11 +26,11 @@ If you're using a text-generation-webui one click installer, you MUST use `wizar
 
 ## Provided files
 
-Two files are provided. **The second file will not work unless you use a recent version of the Triton branch of GPTQ-for-LLaMa**
+Two files are provided. **The second file will not work unless you use a recent version of GPTQ-for-LLaMa**
 
 Specifically, the second file uses `--act-order` for maximum quantisation quality and will not work with oobabooga's fork of GPTQ-for-LLaMa. Therefore at this time it will also not work with `text-generation-webui` one-click installers.
 
-Unless you are able to use the latest Triton GPTQ-for-LLaMa code, please use `medalpaca-13B-GPTQ-4bit-128g.no-act-order.safetensors`
+Unless you are able to use the latest GPTQ-for-LLaMa code, please use `wizardLM-7B-GPTQ-4bit-128g.no-act-order.safetensors`.
 
 * `wizardLM-7B-GPTQ-4bit-128g.no-act-order.safetensors`
 * Works with all versions of GPTQ-for-LLaMa code, both Triton and CUDA branches
@@ -45,7 +45,7 @@ Unless you are able to use the latest Triton GPTQ-for-LLaMa code, please use `me
 * Only works with recent GPTQ-for-LLaMa code
 * **Does not** work with text-generation-webui one-click-installers
 * Parameters: Groupsize = 128g. act-order.
-* Offers highest quality quantisation, but requires recent Triton GPTQ-for-LLaMa code and more VRAM
+* Offers highest quality quantisation, but requires recent GPTQ-for-LLaMa code
 * Command used to create the GPTQ:
 ```
 CUDA_VISIBLE_DEVICES=0 python3 llama.py wizardLM-7B-HF c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors wizardLM-7B-GPTQ-4bit-128g.act-order.safetensors
@@ -57,7 +57,7 @@ File `wizardLM-7B-GPTQ-4bit-128g.no-act-order.safetensors` can be loaded the sam
 
 [Instructions on using GPTQ 4bit files in text-generation-webui are here](https://github.com/oobabooga/text-generation-webui/wiki/GPTQ-models-\(4-bit-mode\)).
 
-The other two `safetensors` model files were created using `--act-order` to give the maximum possible quantisation quality, but this means it requires that the latest Triton GPTQ-for-LLaMa is used inside the UI.
+The other `safetensors` model file was created using `--act-order` to give the maximum possible quantisation quality, but this means it requires that the latest GPTQ-for-LLaMa is used inside the UI.
 
 If you want to use the act-order `safetensors` files and need to update the Triton branch of GPTQ-for-LLaMa, here are the commands I used to clone the Triton branch of GPTQ-for-LLaMa, clone text-generation-webui, and install GPTQ into the UI:
 ```
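For readers unfamiliar with the flags in the quantisation command quoted in the diff (`--wbits 4 --groupsize 128`): groupsize means each run of 128 consecutive weights shares its own scale and zero-point rather than one pair per whole tensor. The sketch below is a toy illustration of that group-wise 4-bit round-trip only; it is not the GPTQ algorithm itself (GPTQ additionally uses second-order information to minimise quantisation error), and all names in it are made up for the example.

```python
import numpy as np

def quantize_groups(weights, bits=4, groupsize=128):
    """Toy group-wise asymmetric quantisation: each group of `groupsize`
    consecutive weights gets its own scale and zero-point."""
    qmax = 2 ** bits - 1
    flat = weights.reshape(-1, groupsize)
    lo = flat.min(axis=1, keepdims=True)
    hi = flat.max(axis=1, keepdims=True)
    scale = np.maximum((hi - lo) / qmax, 1e-8)  # avoid divide-by-zero on flat groups
    zero = np.round(-lo / scale)
    q = np.clip(np.round(flat / scale + zero), 0, qmax).astype(np.uint8)
    return q, scale, zero

def dequantize_groups(q, scale, zero):
    # Invert the affine mapping; error per weight is on the order of half a step.
    return (q.astype(np.float32) - zero) * scale

# 4-bit codes store 8x less than float32 (ignoring the per-group scale/zero overhead).
w = np.random.randn(4, 1024).astype(np.float32)
q, scale, zero = quantize_groups(w, bits=4, groupsize=128)
w_hat = dequantize_groups(q, scale, zero).reshape(w.shape)
max_err = np.abs(w - w_hat).max()
```

A smaller groupsize tracks the local weight range more tightly (lower error, more scale/zero overhead); `--act-order` changes the order in which GPTQ processes columns to further reduce error, which is why the act-order file needs newer GPTQ-for-LLaMa code to load.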