./llama.cpp/imatrix -m ./gguf-model_f16.gguf -f ./wiki.train.raw -o ./gguf-model_f16.imatrix --chunks 32
./llama.cpp/quantize --imatrix ./gguf-model_f16.imatrix ./gguf-model_f16.gguf ./chatntq_chatvector-MoE-Antler_chatvector-2x7B_iq3xxs.gguf iq3_xxs
```

## Environment

- CPU: Ryzen 5 5600X
- GPU: GeForce RTX 3060 12GB
- RAM: DDR4-3200 96GB
- OS: Windows 10
- Software: Python 3.12.2, [KoboldCpp](https://github.com/LostRuins/koboldcpp) v1.61.2

#### KoboldCpp settings

(Only settings changed from the defaults are listed.)
- `GPU Layers: 33` (33 or higher fully offloads the model to the GPU)
- `Context Size: 32768`
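
The same settings can also be passed on the command line instead of through the launcher GUI. A sketch, assuming KoboldCpp is run from its source checkout and using the quantized GGUF produced above (flag names as in KoboldCpp v1.61.x):

```shell
# Launch KoboldCpp with the settings listed above.
# --gpulayers 33 offloads all layers to the GPU; --contextsize 32768 matches Context Size.
python koboldcpp.py \
  --model ./chatntq_chatvector-MoE-Antler_chatvector-2x7B_iq3xxs.gguf \
  --gpulayers 33 \
  --contextsize 32768
```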