TheBloke committed
Commit d99179a · 1 Parent(s): fbbc632

Update README.md

Files changed (1):
  README.md (+6 −2)
README.md CHANGED
@@ -30,7 +30,9 @@ Support is also expected to come to llama.cpp, however work is still being done
 
 To use the increased context with KoboldCpp, use `--contextsize` to set the desired context, e.g. `--contextsize 4096` or `--contextsize 8192` or `--contextsize 16384`.
 
-**NOTE**: Increased context length is an area seeing rapid developments and improvements. It is quite possible that these models may be superseded by new developments in the coming days. If that's the case, I will remove them or update this README as appropriate.
+**NOTE 1**: Currently RoPE models can _only_ be used at a context size greater than 2048. At 2048 they will produce gibberish. Please make sure you are always setting `--contextsize` to a value higher than 2048, e.g. 3072, 4096, etc.
+
+**NOTE 2**: Increased context length is an area seeing rapid developments and improvements. It is quite possible that these models may be superseded by new developments in the coming days. If that's the case, I will remove them or update this README as appropriate.
 
 ## Repositories available
 
@@ -87,9 +89,11 @@ Refer to the Provided Files table below to see what files use which methods, and
 On Linux I use the following command line to launch the KoboldCpp UI with CUDA acceleration and a context size of 4096:
 
 ```
-python ./koboldcpp.py --stream --unbantokens --threads 8 --usecublas --gpulayers 100 longchat-13b-16k.ggmlv3.q4_K_M.bin
+python ./koboldcpp.py --contextsize 4096 --stream --unbantokens --threads 8 --usecublas --gpulayers 100 longchat-13b-16k.ggmlv3.q4_K_M.bin
 ```
 
+Change `--contextsize` to the context size you want - **it must be higher than 2048, or the model will produce gibberish**.
+
 Change `--gpulayers 100` to the number of layers you want/are able to offload to the GPU. Remove it if you don't have GPU acceleration.
 
 For OpenCL acceleration, change `--usecublas` to `--useclblast 0 0`. You may need to change the second `0` to `1` if you have both an iGPU and a discrete GPU.
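The rule this commit documents (these RoPE-scaled models need a context size strictly greater than 2048) can be sketched as a small launcher helper. `build_koboldcpp_args` is a hypothetical illustration, not part of KoboldCpp itself; only the flags shown in the README above are assumed to exist:

```python
def build_koboldcpp_args(model_path, contextsize, gpulayers=None, use_opencl=False):
    """Assemble a KoboldCpp argument list (hypothetical helper).

    Rejects context sizes of 2048 or below, where a RoPE-scaled model
    would produce gibberish per the README note.
    """
    if contextsize <= 2048:
        raise ValueError(
            "RoPE-scaled models need --contextsize > 2048, e.g. 3072 or 4096"
        )
    args = [
        "python", "./koboldcpp.py",
        "--contextsize", str(contextsize),
        "--stream", "--unbantokens", "--threads", "8",
    ]
    # Pick the GPU backend: CUDA (--usecublas) by default,
    # OpenCL via CLBlast (--useclblast 0 0) if requested.
    args += ["--useclblast", "0", "0"] if use_opencl else ["--usecublas"]
    if gpulayers is not None:
        # Omit --gpulayers entirely when there is no GPU acceleration.
        args += ["--gpulayers", str(gpulayers)]
    args.append(model_path)
    return args

cmd = build_koboldcpp_args("longchat-13b-16k.ggmlv3.q4_K_M.bin", 4096, gpulayers=100)
print(" ".join(cmd))
```

This reproduces the command line from the diff above for the CUDA case, while making the "> 2048" constraint impossible to forget.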