leafspark commited on
Commit
2539e2c
1 Parent(s): 7ad14eb

readme: daily update

Files changed (1)
  1. README.md +9 -7
README.md CHANGED
@@ -84,9 +84,10 @@ Note: Use iMatrix quants only if you can fully offload to GPU, otherwise speed w
 | Quant | Status | Size | Description | KV Metadata | Weighted | Notes |
 |----------|-------------|-----------|--------------------------------------------|-------------|----------|-------|
 | BF16 | Available | 439 GB | Lossless :) | Old | No | Q8_0 is sufficient for most cases |
-| Q8_0 | Uploading | 233.27 GB | High quality *recommended* | Updated | Yes | |
+| Q8_0 | Available | 233.27 GB | High quality *recommended* | Updated | Yes | |
+| Q5_K_M | Uploading | 155 GB | Medium-low quality | Updated | Yes | |
 | Q4_K_M | Available | 132 GB | Medium quality *recommended* | Old | No | |
-| Q3_K_M | Uploading | 92.6 GB | Medium-low quality | Updated | Yes | |
+| Q3_K_M | Available | 104 GB | Medium-low quality | Updated | Yes | |
 | IQ3_XS | Available | 89.6 GB | Better than Q3_K_M | Old | Yes | |
 | Q2_K | Available | 80.0 GB | Low quality **not recommended** | Old | No | |
 | IQ2_XXS | Available | 61.5 GB | Lower quality **not recommended** | Old | Yes | |
@@ -97,8 +98,9 @@ Note: Use iMatrix quants only if you can fully offload to GPU, otherwise speed w
 
 | Planned Quant | Notes |
 |-------------------|---------|
-| Q5_K_M | |
-| Q5_K_M | |
+| Q5_K_S | |
+| Q4_K_S | |
+| Q3_K_S | |
 | Q6_K | |
 | IQ4_XS | |
 | IQ2_XS | |
@@ -116,9 +118,9 @@ deepseek2.leading_dense_block_count=int:1
 deepseek2.rope.scaling.yarn_log_multiplier=float:0.0707
 ```
 
-The `Q8_0` quant contains these parameters, along with future ones, so as long as you're running a supported build of llama.cpp no `--override-kv` parameters are required.
+Quants with "Updated" metadata contain these parameters, so as long as you're running a supported build of llama.cpp no `--override-kv` parameters are required.
 
-A precompiled AVX2 version is avaliable at `llama.cpp-039896407afd40e54321d47c5063c46a52da3e01.zip` in the root of this repo.
+A precompiled Windows AVX2 version is available at `llama.cpp-039896407afd40e54321d47c5063c46a52da3e01.zip` in the root of this repo.
 
 # License:
 - DeepSeek license for model weights, which can be found in the `LICENSE` file in the root of this repo
@@ -128,7 +130,7 @@ A precompiled AVX2 version is avaliable at `llama.cpp-039896407afd40e54321d47c5063c46a52da3e01.zip` in the root of this repo.
 *~1.5t/s* with Ryzen 3 3700x (96gb 3200mhz) `[Q2_K]`
 
 # iMatrix:
-Find `imatrix.dat` in the root of this repo, made with a `Q2_K` quant (see here for info: [https://github.com/ggerganov/llama.cpp/issues/5153#issuecomment-1913185693](https://github.com/ggerganov/llama.cpp/issues/5153#issuecomment-1913185693))
+Find `imatrix.dat` in the root of this repo, made with a `Q2_K` quant containing 62 chunks (see here for info: [https://github.com/ggerganov/llama.cpp/issues/5153#issuecomment-1913185693](https://github.com/ggerganov/llama.cpp/issues/5153#issuecomment-1913185693))
 
 Using `groups_merged.txt`, find it here: [https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384)
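For quants whose KV metadata is still marked "Old", the parameters shown in the diff above can instead be supplied at load time. A minimal sketch of such an invocation, assuming a llama.cpp build from around this commit — the binary name and model filename are placeholders, and `--override-kv` takes `KEY=TYPE:VALUE` pairs matching the values listed above:

```
# Hypothetical example: adjust the binary name (main / llama-cli,
# depending on build) and the model path to your setup.
./main -m DeepSeek-V2-Chat.Q4_K_M.gguf \
  --override-kv deepseek2.leading_dense_block_count=int:1 \
  --override-kv deepseek2.rope.scaling.yarn_log_multiplier=float:0.0707 \
  -p "Hello"
```

With an "Updated"-metadata quant (e.g. the new `Q8_0`), no overrides are needed, as the README text above notes.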