
This is a 4-bit GPTQ quantization of the model, converted using GPTQ-for-LLaMa. The source model is https://huggingface.co/Peeepy/Airoboros-13b-SuperHOT-8k.

You will need a monkey patch at inference time to use the 8k context; see the patch file included in this repository. If you are using a different inference engine (such as llama.cpp or exllama), you will need to apply the equivalent patch there.
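The core idea behind the 8k patch is linear position interpolation: position indices are scaled down by a fixed factor (8192 / 2048 = 4 for a 2k-trained model) before the rotary embedding angles are computed, so an 8k context maps into the positional range the model was trained on. A minimal sketch of that arithmetic, assuming a standard rotary-embedding formulation (illustrative only; use the actual patch file in this repo for inference):

```python
import math

def rotary_angles(position, dim=128, base=10000.0, scale=4.0):
    """Rotary embedding angles for one position, with linear interpolation.

    Dividing the position by `scale` compresses an 8k context into the
    original 2k positional range (the SuperHOT-style scaling trick).
    `dim`, `base`, and `scale` here are illustrative defaults.
    """
    pos = position / scale
    return [pos / (base ** (2 * i / dim)) for i in range(dim // 2)]

# With scale=4, position 8191 produces the same angles that position
# 8191/4 ≈ 2048 would have produced in the unpatched model:
angles_scaled = rotary_angles(8191, scale=4.0)
angles_orig = rotary_angles(8191 / 4, scale=1.0)
assert all(abs(a - b) < 1e-9 for a, b in zip(angles_scaled, angles_orig))
```

In the real monkey patch this scaling is applied inside the model's rotary embedding module rather than as a standalone function, but the position arithmetic is the same.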