Update README.md
README.md CHANGED
@@ -20,6 +20,8 @@ This GPTQ model was quantized using [GPTQ-for-LLaMa](https://github.com/qwopqwop
 python3 llama.py /content/koala-13B-HF c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save /content/koala-13B-4bit-128g.pt
 ```
 
+I created this model using the latest Triton branch of GPTQ-for-LLaMa, but I believe it can also be run with the CUDA branch.
+
 ## How to run with text-generation-webui
 
 The model files provided will not load as-is with [oobabooga's text-generation-webui](https://github.com/oobabooga/text-generation-webui).
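The hunk above describes the quantization at a high level; here is a rough end-to-end sketch of it as shell commands. The clone URL, branch name, and dependency-install step are assumptions rather than part of this commit; the `llama.py` invocation is the one shown above.

```
# Sketch: reproduce the 4-bit GPTQ quantization described above.
# Assumes /content/koala-13B-HF already holds the model in HF format.
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa
cd GPTQ-for-LLaMa
git checkout triton                # branch name is an assumption based on the note above
pip install -r requirements.txt    # dependency-install step is an assumption
python3 llama.py /content/koala-13B-HF c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save /content/koala-13B-4bit-128g.pt
```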
@@ -37,9 +39,11 @@ ln -s GPTQ-for-LLaMa text-generation-webui/repositories/GPTQ-for-LLaMa
 Then install this model into `text-generation-webui/models` and run text-generation-webui as follows:
 ```
 cd text-generation-webui
-python server.py --model koala-13B-4bit-128g --wbits 4 --groupsize 128 --model_type Llama
+python server.py --model koala-13B-GPTQ-4bit-128g --wbits 4 --groupsize 128 --model_type Llama
 ```
 
+The above commands assume you have installed all dependencies for GPTQ-for-LLaMa and text-generation-webui. Please see their respective repositories for further information.
+
 ## Coming soon
 
 Tomorrow I will upload a `safetensors` file as well.
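Similarly, a sketch of the install-and-run sequence from this hunk as a single script. The model folder name matches the `--model` flag above; the exact set of files to copy (the quantized `.pt` plus the tokenizer and config files from the original HF model) is an assumption:

```
# Sketch: place the model where text-generation-webui expects it and launch.
# Folder name matches the --model flag; the file list is an assumption
# (quantized .pt plus tokenizer/config files from the original HF model).
mkdir -p text-generation-webui/models/koala-13B-GPTQ-4bit-128g
cp /content/koala-13B-4bit-128g.pt text-generation-webui/models/koala-13B-GPTQ-4bit-128g/
cp /content/koala-13B-HF/tokenizer* /content/koala-13B-HF/*config* text-generation-webui/models/koala-13B-GPTQ-4bit-128g/

# Link GPTQ-for-LLaMa into the webui (an absolute target avoids a dangling relative symlink).
mkdir -p text-generation-webui/repositories
ln -s "$(pwd)/GPTQ-for-LLaMa" text-generation-webui/repositories/GPTQ-for-LLaMa

cd text-generation-webui
python server.py --model koala-13B-GPTQ-4bit-128g --wbits 4 --groupsize 128 --model_type Llama
```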