./llama.cpp/imatrix -m ./gguf-model_f16.gguf -f ./wiki.train.raw -o ./gguf-model_f16.imatrix --chunks 32
./llama.cpp/quantize --imatrix ./gguf-model_f16.imatrix ./gguf-model_f16.gguf ./chatntq_chatvector-MoE-Antler_chatvector-2x7B_iq3xxs.gguf iq3_xxs
```

## Environment

- CPU: Ryzen 5 5600X
- GPU: GeForce RTX 3060 12GB
- RAM: DDR4-3200 96GB
- OS: Windows 10
- Software: Python 3.12.2, [KoboldCpp](https://github.com/LostRuins/koboldcpp) v1.61.2

#### KoboldCpp settings

(Only settings changed from the defaults are listed.)
- `GPU Layers: 33` (33 or higher fully offloads the model to the GPU)
- `Context Size: 32768`
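
The same settings can also be passed on the command line instead of through the launcher GUI. A sketch, assuming KoboldCpp is run from its source checkout and using the quantized GGUF produced above (flag names as in KoboldCpp v1.61.x):

```shell
# Launch KoboldCpp with the settings listed above.
# --gpulayers 33 offloads all layers to the GPU; --contextsize 32768 matches Context Size.
python koboldcpp.py \
  --model ./chatntq_chatvector-MoE-Antler_chatvector-2x7B_iq3xxs.gguf \
  --gpulayers 33 \
  --contextsize 32768
```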