Question

#1
by NotKelenic - opened

Does anyone know how to stop this model from using all of its available tokens in a response? I set max_tokens to around 500. It uses all of them and gets cut off mid-word.

I recommend setting max tokens a bit lower or higher, ideally to a multiple of 8, e.g. 256, 512, or 1024. This isn't strictly necessary, but it often works. I use 256 for usual RP, 320 for text adventure, and 512 for very detailed ERP. Context formatting and the context template might also make a difference here; try settings similar to those in the provided screenshot. Even with all these settings, this model can still (though rarely) write some extra text, but "Trim incomplete" will do its magic and make your experience better.
image.png
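
If your front end has no such option, the idea behind "Trim incomplete" can be approximated by hand. The snippet below is only a rough sketch of what such an option might do, not the actual implementation of any front end. The function name `trim_incomplete` and the character sets are my own assumptions, and it will misfire on things like decimals or abbreviations.

```python
# Characters assumed to end a sentence, and closers that may follow them
# (quotes, asterisks for RP actions, brackets). Both sets are assumptions.
SENTENCE_ENDINGS = ".!?…"
CLOSERS = "\"'”’*)]"


def trim_incomplete(text: str) -> str:
    """Cut a truncated reply back to the end of its last complete sentence."""
    # Position of the last sentence-ending character, or -1 if there is none.
    last = max(text.rfind(ch) for ch in SENTENCE_ENDINGS)
    if last == -1:
        # Nothing that looks like a sentence end: return the text unchanged
        # apart from trailing whitespace.
        return text.rstrip()
    end = last + 1
    # Keep any closing quotes or brackets directly after the punctuation.
    while end < len(text) and text[end] in CLOSERS:
        end += 1
    return text[:end]


if __name__ == "__main__":
    print(trim_incomplete("She opened the door. The room was da"))
```

You would run this on the raw completion after the model hits the max_tokens limit, so the reply ends on the last full sentence instead of stopping mid-word.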

OddTheGreat changed discussion status to closed
