Commit b546c83
Parent(s): 9d97091

Expand Repositories section to add links to a couple more versions
README.md CHANGED
@@ -43,7 +43,9 @@ Multiple GPTQ parameter permutations are provided; see Provided Files below for

 * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/llama2_70b_chat_uncensored-GPTQ)
 * [2, 3, 4, 5, 6 and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/llama2_70b_chat_uncensored-GGML)
-* [
+* [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference, plus fp16 GGUF for requantizing](https://huggingface.co/TheBloke/YokaiKoibito/WizardLM-Uncensored-Falcon-40B-GGUF)
+* [Jarrad Hope's unquantised model in fp16 pytorch format, for GPU inference and further conversions](https://huggingface.co/YokaiKoibito/llama2_70b_chat_uncensored-fp16)
+* [Jarrad Hope's original unquantised fp32 model in pytorch format, for further conversions](https://huggingface.co/jarradh/llama2_70b_chat_uncensored)

 ## Prompt template: Human-Response

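As a minimal sketch (not part of the commit itself) of how the GGUF files in the new link might be used for the CPU+GPU inference the bullet describes, the following uses the llama-cpp-python bindings. The local file name, GPU layer split, and the exact prompt wording are assumptions for illustration, inferred from the "Human-Response" template heading above rather than taken from the diff.

```python
# Sketch: run one of the quantised GGUF files with llama-cpp-python,
# assuming a Q4_K_M file has already been downloaded locally.
from llama_cpp import Llama

llm = Llama(
    model_path="llama2_70b_chat_uncensored.Q4_K_M.gguf",  # hypothetical local file name
    n_gpu_layers=35,  # offload part of the model to the GPU; 0 = CPU only
    n_ctx=4096,       # context window size
)

# Assumed wording of the README's "Human-Response" prompt template.
prompt = "### HUMAN:\nWhat is a GGUF file?\n\n### RESPONSE:\n"
out = llm(prompt, max_tokens=128, stop=["### HUMAN:"])
print(out["choices"][0]["text"])
```

The fp16 GGUF mentioned in the same bullet is the usual starting point for requantising to other bit widths with llama.cpp's quantize tool, which is why the commit links it alongside the pre-quantised files.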