Update README.md
README.md
CHANGED
@@ -2,7 +2,17 @@
|
|
2 |
base_model: HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1
|
3 |
license: apache-2.0
|
4 |
---
|
5 |
-
##
6 |
|
7 |
|
8 |
# Importance-Matrix quantizations of HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1
|
@@ -52,4 +62,6 @@ Final estimate: PPL = 4.0884 +/- 0.1499
|
|
52 |
orpo3ns.gguf 58536MB
|
53 |
[1]2.8042,[2]3.3418,[3]3.9400,[4]3.5859,[5]3.2042,[6]3.0524,[7]2.9738,[8]2.9695,[9]3.0232,[10]3.0099,[11]3.0510,[12]3.0589,
|
54 |
Final estimate: PPL = 3.0589 +/- 0.09882
|
55 |
-
```
2 |
base_model: HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1
|
3 |
license: apache-2.0
|
4 |
---
|
5 |
+
## Alex, download the 3-bit orpo3ns.gguf.part0 and orpo3ns.gguf.part1 files, then:
|
6 |
+
```
|
7 |
+
cd ~/Downloads
|
8 |
+
|
9 |
+
cat orpo3ns.gguf.part* > orpo3ns.gguf && rm -rf orpo3ns.gguf.part*
|
10 |
+
|
11 |
+
```
|
12 |
+
|
13 |
+
For LM Studio, copy the full orpo3ns.gguf file into your ~/.cache/lm-studio/models/YourNAME/ directory.
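
The join and copy steps above can also be wrapped in a slightly more defensive script. This is only a sketch, not part of the original instructions: the function names `join_parts` and `install_to_lmstudio` are hypothetical, and it assumes a bash shell and that the part files sort correctly in glob order (true for `.part0` and `.part1`). It deletes the parts only after checking that the joined file's size equals the sum of the part sizes.

```shell
# join_parts OUT
# Concatenate OUT.part* (in glob order) into OUT, then delete the parts
# only if OUT's byte size equals the sum of the part sizes.
# Hypothetical helper; glob order is only safe for single-digit suffixes.
join_parts() {
  local out="$1" expected=0 actual p
  set -- "$out".part*
  if [ ! -e "$1" ]; then
    echo "no part files found for $out" >&2
    return 1
  fi
  cat "$@" > "$out" || return 1
  for p in "$@"; do
    expected=$((expected + $(wc -c < "$p")))
  done
  actual=$(( $(wc -c < "$out") ))
  if [ "$actual" -ne "$expected" ]; then
    echo "size mismatch: got $actual bytes, expected $expected" >&2
    return 1
  fi
  rm -- "$@"
}

# install_to_lmstudio MODEL DEST_DIR
# Create DEST_DIR if needed and copy MODEL into it.
# Hypothetical helper; DEST_DIR is whatever folder LM Studio scans for you.
install_to_lmstudio() {
  local model="$1" dest="$2"
  mkdir -p "$dest" && cp -- "$model" "$dest"/
}
```

Example use, keeping the `YourNAME` placeholder from the instructions above: `join_parts ~/Downloads/orpo3ns.gguf && install_to_lmstudio ~/Downloads/orpo3ns.gguf ~/.cache/lm-studio/models/YourNAME`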
|
14 |
+
|
15 |
+
## orpo4ns.gguf is good to go; the 2-bit version is also done but not recommended.
|
16 |
|
17 |
|
18 |
# Importance-Matrix quantizations of HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1
|
|
|
62 |
orpo3ns.gguf 58536MB
|
63 |
[1]2.8042,[2]3.3418,[3]3.9400,[4]3.5859,[5]3.2042,[6]3.0524,[7]2.9738,[8]2.9695,[9]3.0232,[10]3.0099,[11]3.0510,[12]3.0589,
|
64 |
Final estimate: PPL = 3.0589 +/- 0.09882
|
65 |
+
```
|
66 |
+
|
67 |
+
# The 3-bit version is surprisingly usable even though it is only 58GB
|