Update README.md
README.md CHANGED
@@ -20,6 +20,8 @@ This GPTQ model was quantized using [GPTQ-for-LLaMa](https://github.com/qwopqwop
 python3 llama.py /content/koala-13B-HF c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save /content/koala-13B-4bit-128g.pt
 ```
 
+I created this model using the latest Triton branch of GPTQ-for-LLaMa, but I believe it can also be run with the CUDA branch.
+
 ## How to run with text-generation-webui
 
 The model files provided will not load as-is with [oobabooga's text-generation-webui](https://github.com/oobabooga/text-generation-webui).
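The hunk above describes the quantization at a high level; here is a rough end-to-end sketch of it as shell commands. The clone URL, branch name, and dependency-install step are assumptions rather than part of this commit; the `llama.py` invocation is the one shown above.

```
# Sketch: reproduce the 4-bit GPTQ quantization described above.
# Assumes /content/koala-13B-HF already holds the model in HF format.
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa
cd GPTQ-for-LLaMa
git checkout triton                # branch name is an assumption based on the note above
pip install -r requirements.txt    # dependency-install step is an assumption
python3 llama.py /content/koala-13B-HF c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save /content/koala-13B-4bit-128g.pt
```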
@@ -37,9 +39,11 @@ ln -s GPTQ-for-LLaMa text-generation-webui/repositories/GPTQ-for-LLaMa
 Then install this model into `text-generation-webui/models` and run text-generation-webui as follows:
 ```
 cd text-generation-webui
-python server.py --model koala-13B-4bit-128g --wbits 4 --groupsize 128 --model_type Llama
+python server.py --model koala-13B-GPTQ-4bit-128g --wbits 4 --groupsize 128 --model_type Llama
 ```
 
+The above commands assume you have installed all dependencies for GPTQ-for-LLaMa and text-generation-webui. Please see their respective repositories for further information.
+
 ## Coming soon
 
 Tomorrow I will upload a `safetensors` file as well.
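Similarly, a sketch of the install-and-run sequence from this hunk as a single script. The model folder name matches the `--model` flag above; the exact set of files to copy (the quantized `.pt` plus the tokenizer and config files from the original HF model) is an assumption:

```
# Sketch: place the model where text-generation-webui expects it and launch.
# Folder name matches the --model flag; the file list is an assumption
# (quantized .pt plus tokenizer/config files from the original HF model).
mkdir -p text-generation-webui/models/koala-13B-GPTQ-4bit-128g
cp /content/koala-13B-4bit-128g.pt text-generation-webui/models/koala-13B-GPTQ-4bit-128g/
cp /content/koala-13B-HF/tokenizer* /content/koala-13B-HF/*config* text-generation-webui/models/koala-13B-GPTQ-4bit-128g/

# Link GPTQ-for-LLaMa into the webui (an absolute target avoids a dangling relative symlink).
mkdir -p text-generation-webui/repositories
ln -s "$(pwd)/GPTQ-for-LLaMa" text-generation-webui/repositories/GPTQ-for-LLaMa

cd text-generation-webui
python server.py --model koala-13B-GPTQ-4bit-128g --wbits 4 --groupsize 128 --model_type Llama
```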