Text Generation
Transformers
Safetensors
llama
Not-For-All-Audiences
nsfw
text-generation-inference
Inference Endpoints
AzureBlack committed · Commit 70386be · 1 Parent(s): fca2891 · Update README.md
README.md CHANGED
@@ -7,17 +7,17 @@ tags:
 
 ExllamaV2 version of the model created by [Undi](https://huggingface.co/Undi95)!
 
-Original Model https://huggingface.co/Undi95/Dawn-v2-70B
+Original Model https://huggingface.co/Undi95/Dawn-v2-70B
 
 Requires ExllamaV2, which is being developed by turboderp https://github.com/turboderp/exllamav2 under an MIT license.
 
 Main branch is 4.6bpw 8h (req ??gb)
 
-2.5b8h branch is 2.
+2.5b8h branch is 2.5bpw 8h (req 24gb and the 8b cache setting) - Add BOS token must be unchecked at this weight or output is nonsense. New quant method applied 12/17/2023
 
 5.0b8h branch is 5.0bpw 8h (req ??gb)
 
-6b8h branch is
+6b8h branch is 6.0bpw 8h requires between 60-72gb
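Several branch descriptions still list their memory requirement as "??gb". As a rough sanity check, the weights-only footprint of a bits-per-weight (bpw) quant can be estimated from the parameter count. This is a back-of-envelope sketch only, assuming ~70e9 parameters from the "70B" in the model name; the KV cache, activations, and runtime overhead add several GB on top, which is consistent with the 6.0bpw branch listing 60-72gb rather than the ~52 GB of weights alone.

```python
# Rough weights-only VRAM estimate for a bpw (bits-per-weight) quant.
# Assumption: ~70e9 parameters (from the "70B" model name). KV cache,
# activations, and framework overhead are NOT included and add several GB.

def est_weight_gb(n_params: float, bpw: float) -> float:
    """Approximate size in GB of the quantized weights alone."""
    return n_params * bpw / 8 / 1e9  # bits -> bytes -> GB

if __name__ == "__main__":
    for bpw in (2.5, 4.6, 5.0, 6.0):
        print(f"{bpw}bpw: ~{est_weight_gb(70e9, bpw):.1f} GB (weights only)")
```

For example, the 2.5bpw branch works out to roughly 22 GB of weights, which lines up with the stated 24gb requirement once cache overhead is added.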