
This is a 4-bit GPTQ quantization of the model, converted using GPTQ-for-LLaMa. The source model is https://huggingface.co/Peeepy/Airoboros-13b-SuperHOT-8k.

You will need a monkey patch at inference time to use the 8k context; see the patch file included in this repository. If you are using a different inference engine (such as llama.cpp or exllama), you will need to apply the equivalent patch there.
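The core idea behind the 8k patch is linear position interpolation: position indices are scaled down by a fixed factor (8192 / 2048 = 4 for a 2k-trained model) before the rotary embedding angles are computed, so an 8k context maps into the positional range the model was trained on. A minimal sketch of that arithmetic, assuming a standard rotary-embedding formulation (illustrative only; use the actual patch file in this repo for inference):

```python
import math

def rotary_angles(position, dim=128, base=10000.0, scale=4.0):
    """Rotary embedding angles for one position, with linear interpolation.

    Dividing the position by `scale` compresses an 8k context into the
    original 2k positional range (the SuperHOT-style scaling trick).
    `dim`, `base`, and `scale` here are illustrative defaults.
    """
    pos = position / scale
    return [pos / (base ** (2 * i / dim)) for i in range(dim // 2)]

# With scale=4, position 8191 produces the same angles that position
# 8191/4 ≈ 2048 would have produced in the unpatched model:
angles_scaled = rotary_angles(8191, scale=4.0)
angles_orig = rotary_angles(8191 / 4, scale=1.0)
assert all(abs(a - b) < 1e-9 for a, b in zip(angles_scaled, angles_orig))
```

In the real monkey patch this scaling is applied inside the model's rotary embedding module rather than as a standalone function, but the position arithmetic is the same.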