---
license: other
---
A 2.5-bit quantization of airoboros-l2-70b-gpt4-1.4.1 (https://huggingface.co/jondurbin/airoboros-l2-70b-gpt4-1.4.1), made with ExLlamaV2.

Updated as of September 21, 2023, which should fix the bad perplexity (ppl) results.

If you are on Ubuntu, I suggest using it with flash-attn. It reduces VRAM usage by a good margin, which is especially useful in this case (a 70B model on a single 24 GB GPU).
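
For reference, below is a minimal loading-and-generation sketch using the ExLlamaV2 Python API; if flash-attn is installed, it is picked up automatically. The model directory path and sampling settings are placeholders, and the exact API surface may vary between ExLlamaV2 versions:

```python
# Minimal sketch: load this 2.5-bpw quant with the ExLlamaV2 Python API.
# Assumes exllamav2 is installed; flash-attn, if present, is used automatically.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

model_dir = "/path/to/airoboros-l2-70b-2.5bpw"  # placeholder: local download of this repo

config = ExLlamaV2Config()
config.model_dir = model_dir
config.prepare()

model = ExLlamaV2(config)
model.load()  # at 2.5 bpw the weights fit on a single 24 GB GPU

tokenizer = ExLlamaV2Tokenizer(config)
cache = ExLlamaV2Cache(model)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7  # example sampling values, adjust to taste
settings.top_p = 0.9

prompt = "A chat between a curious user and an assistant.\nUSER: Hello!\nASSISTANT:"
print(generator.generate_simple(prompt, settings, 200))
```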