Update README.md
README.md

```diff
@@ -27,4 +27,6 @@ Using llama.cpp fork: [https://github.com/fairydreaming/llama.cpp/tree/deepseek-
 - ~~q2_k (after q4_k_m) [estimated size: ~65gb]~~
 - ~~q3_k_s (low priority) [estimated size: 96.05gb]~~
 
-If quantize.exe supports it I will make RTN quants (edit: it doesn't).
+If quantize.exe supports it I will make RTN quants (edit: it doesn't, will try building from fork).
+
+Note: the bf16 GGUF does not have some DeepSeek v2 specific parameters, will look into adding them
```
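The added note concerns DeepSeek-V2 specific metadata missing from the bf16 GGUF. A minimal sketch of how one might check which keys made it into the file, using the gguf-py package that ships with llama.cpp; the file name and the `deepseek2.` key prefix are assumptions, not something this commit specifies:

```python
# Sketch (not from this commit): list the metadata keys in the converted
# GGUF so the DeepSeek-V2 specific parameters can be checked by hand.
# Requires the gguf-py package bundled with llama.cpp (pip install gguf);
# the file name below is a placeholder.
from gguf import GGUFReader

reader = GGUFReader("deepseek-v2-bf16.gguf")

# Architecture-specific keys are namespaced by the arch name, so for
# DeepSeek-V2 they are expected under a "deepseek2." prefix (an assumption
# based on how llama.cpp names per-architecture metadata).
for name in reader.fields:
    marker = "*" if name.startswith("deepseek2.") else " "
    print(marker, name)
```

If the expected keys turn out to be absent, regenerating the GGUF with an updated convert script from the fork would presumably be the fix rather than patching the file in place.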