Help. Getting Gibberish Responses?
I assume you are using ooba?
Try updating it in case you have an older version.
Also try changing the preset in the Parameters tab, or in SillyTavern or whatever frontend you use. Maybe the temperature or some other sampler is set way too high?
I don't know why, but this exl2 gives me much worse responses than GGUF Q5_K_M and doesn't follow guidelines. Instead of role-playing, it turns out to be story-telling... 🤔
I loaded it up and it is coherent for me on an up-to-date ooba. I do not get any runaway generation either.
If you are getting actual gibberish, I would make sure your backend (ooba, tabby, or whatever you are using) is up to date. You could also check your sampler settings; the Shortwave preset usually works fine for me.
Regarding the context, are you trying to run 32k context? If so, that is probably the issue, since I do not think exllama handles sliding window attention. Try 8k instead.
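If it helps, here is a rough sketch of how you could check this from the model's config.json before loading. It assumes the config uses the Hugging Face Mistral-style keys `sliding_window` and `max_position_embeddings`, which I have not verified for this particular quant. The helper names `load_config` and `suggest_context` are made up for illustration, and the 8k fallback is just the value suggested above, not a limit documented anywhere.

```python
import json
from pathlib import Path


def load_config(model_dir: str) -> dict:
    """Read config.json from a local model folder (path is whatever you downloaded to)."""
    return json.loads((Path(model_dir) / "config.json").read_text(encoding="utf-8"))


def suggest_context(config: dict, requested: int, fallback: int = 8192) -> int:
    """Return a context length to try, given a model config dict.

    Assumption: the backend does not handle sliding window attention, so any
    request larger than the window is capped at `fallback` (8k by default).
    """
    window = config.get("sliding_window")            # may be missing or null
    max_pos = config.get("max_position_embeddings")  # may be missing

    limit = requested
    if isinstance(max_pos, int):
        # Never ask for more than the model advertises.
        limit = min(limit, max_pos)
    if isinstance(window, int) and limit > window:
        # Request reaches past the sliding window: fall back to the safer value.
        limit = min(limit, fallback)
    return limit
```

For example, `suggest_context(load_config("path/to/model"), 32768)` gives you a number to put in the max sequence length field of your loader. If it comes back lower than what you asked for, the context setting is a likely suspect.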