---
license: other
---
2.5-bit quantization of airoboros 70b 1.4.1 (https://huggingface.co/jondurbin/airoboros-l2-70b-gpt4-1.4.1), using exllama2.

Updated as of September 21, 2023, which should fix the bad perplexity (ppl) results.

If you are using Ubuntu, I suggest running it with flash-attn. It reduces VRAM usage by a good margin, which is especially useful in this case (a 70B model on a single 24 GB VRAM GPU).
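A rough back-of-the-envelope estimate (illustrative figures, not measured values from this card) shows why 24 GB is tight and why saving VRAM on attention matters here:

```python
# Approximate VRAM needed for the quantized weights alone:
# ~70e9 parameters at ~2.5 bits per weight (illustrative assumption).
params = 70e9
bits_per_weight = 2.5

weight_bytes = params * bits_per_weight / 8  # bits -> bytes
weight_gib = weight_bytes / 2**30            # bytes -> GiB

print(f"~{weight_gib:.1f} GiB for weights")  # ~20.4 GiB
```

That leaves only a few GiB of a 24 GB card for activations and the KV cache, which is why a memory-efficient attention implementation such as flash-attn helps.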