Update README.md
README.md
CHANGED
@@ -45,18 +45,18 @@ Now that we have ExLlama, that is the recommended loader to use for these models
 
 Reminder: ExLlama does not support 3-bit models, so if you wish to try those quants, you will need to use AutoGPTQ or GPTQ-for-LLaMa.
 
-
 ## AutoGPTQ and GPTQ-for-LLaMa require the latest version of Transformers
 
-If you plan to use any of these quants with AutoGPTQ or GPTQ-for-LLaMa,
+If you plan to use any of these quants with AutoGPTQ or GPTQ-for-LLaMa, your Transformers needs to be using the latest GitHub code.
+
+If you're using text-generation-webui and have updated to the latest version, this is done for you automatically.
+
+If not, you can update it manually with:
 
 ```
 pip3 install git+https://github.com/huggingface/transformers
 ```
 
-If using a UI like text-generation-webui, make sure to do this in the Python environment of text-generation-webui.
-
-
 ## Repositories available
 
 * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/Llama-2-70B-chat-GPTQ)
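
After running the `pip3` command above, a quick way to confirm that Transformers really came from source is to print its version from the same Python environment. This check is a suggestion rather than part of the README; source installs usually carry a `.dev0` suffix, though the exact number will vary:

```
import transformers

# A GitHub source install typically reports a dev version such as "4.32.0.dev0";
# a plain release number means you are likely still on the PyPI build.
print(transformers.__version__)
```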
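Once Transformers is current, the GPTQ repository linked above can be loaded through AutoGPTQ. The sketch below is illustrative rather than taken from the README: it assumes `auto-gptq` is installed with CUDA support, that the chosen quant fits in GPU memory, and that the prompt is a placeholder:

```
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_name = "TheBloke/Llama-2-70B-chat-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

# from_quantized() loads the already-quantised weights directly onto the GPU;
# no separate dequantisation step is needed.
model = AutoGPTQForCausalLM.from_quantized(
    model_name,
    use_safetensors=True,
    device="cuda:0",
)

prompt = "Tell me about AI"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda:0")
output = model.generate(input_ids=input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```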