nisten committed
Commit 6875282 · verified · 1 Parent(s): 491f30f

Update README.md

Files changed (1): README.md +14 -2
README.md CHANGED
@@ -2,7 +2,17 @@
  base_model: HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1
  license: apache-2.0
  ---
- ## orpo4ns.gguf is good to go, 2bit also done but not recommended, other quants STILL UPLOADING.
+ ## Alex, download the 3bit orpo3ns.gguf.part0 & orpo3ns.gguf.part1 files then:
+ ```
+ cd ~/Downloads
+
+ cat orpo3ns.gguf.part* > orpo3ns.gguf && rm -rf orpo3ns.gguf.part*
+
+ ```
+
+ For lmStudio you need to copy the full orpo3ns.gguf file to your ~/.cache/lm-studio/models/YourNAME/
+
+ ## orpo4ns.gguf is good to go, 2bit also done but not recommended.
 
 
  # Importance-Matrix quantizations of HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1
@@ -52,4 +62,6 @@ Final estimate: PPL = 4.0884 +/- 0.1499
  orpo3ns.gguf 58536MB
  [1]2.8042,[2]3.3418,[3]3.9400,[4]3.5859,[5]3.2042,[6]3.0524,[7]2.9738,[8]2.9695,[9]3.0232,[10]3.0099,[11]3.0510,[12]3.0589,
  Final estimate: PPL = 3.0589 +/- 0.09882
- ```
+ ```
+
+ # The 3bit version is surprisingly usable even though only 58GB
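
A note on the reassembly step added in the first hunk above: the one-liner deletes the part files in the same command that creates the merged file. Below is a minimal, more cautious sketch, assuming the parts sit in ~/Downloads as in the README; the ~58 GB expected size is taken from the perplexity log in the second hunk, and any checksum to compare against is hypothetical (none ships with this commit).
```
cd ~/Downloads
# Reassemble the split GGUF, then verify before removing the parts
cat orpo3ns.gguf.part* > orpo3ns.gguf
ls -lh orpo3ns.gguf          # expect roughly 58 GB for the 3-bit quant
sha256sum orpo3ns.gguf       # compare manually against a published checksum, if one exists
rm -v orpo3ns.gguf.part*     # delete the parts only after the merged file looks right
```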
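
For the LM Studio step in the same hunk, a minimal sketch of the copy, assuming the default ~/.cache/lm-studio/models location named in the README; the YourNAME/zephyr-orpo-141b folder names are placeholders, not something LM Studio mandates.
```
# Create a publisher/model folder and copy the merged GGUF into it
mkdir -p ~/.cache/lm-studio/models/YourNAME/zephyr-orpo-141b
cp ~/Downloads/orpo3ns.gguf ~/.cache/lm-studio/models/YourNAME/zephyr-orpo-141b/
```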
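
The PPL lines quoted in the second hunk are in the format printed by llama.cpp's perplexity tool. A sketch of the kind of invocation that produces such a log, under the assumption that the numbers came from llama.cpp; the binary name, context length, and evaluation file are assumptions, not recorded in this commit.
```
# From a llama.cpp build directory; newer builds name the binary llama-perplexity
./perplexity -m ~/Downloads/orpo3ns.gguf -f wiki.test.raw -c 512
# Prints per-chunk values like [1]2.8042,[2]3.3418,... followed by "Final estimate: PPL = ..."
```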