license: other
language:
  - en

EXL2 quantization of Undi95's MM-ReMM-L2-20B.

Model details

Quantized at 3.18 bpw with hb 6. This one can actually fit the full 4K context in 16GB of VRAM. I will redo the other 20B models later.
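As a rough sanity check on the VRAM claim, here is a minimal sketch of the weight footprint at 3.18 bpw. It assumes "20B" means exactly 20e9 parameters (the real count may differ) and ignores the KV cache, activations, and the higher-precision head layer, so treat the result as a ballpark only. The function name `weight_size_gb` is made up for this example.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in decimal gigabytes."""
    # bits -> bytes (divide by 8) -> gigabytes (divide by 1e9)
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: "20B" is treated as exactly 20e9 parameters.
approx_gb = weight_size_gb(20e9, 3.18)
print(f"Approximate weight size: {approx_gb:.2f} GB")
```

Under these assumptions the weights come out to roughly 8 GB, which leaves the rest of a 16GB card for the 4K context cache and runtime overhead.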

Perplexity:

Base = 6.9504

3.18 h6 = 7.0138

Dataset = wikitext
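To put the two perplexity numbers side by side, a small sketch that computes the absolute and relative increase from the values listed above. The variable names are just for illustration.

```python
# Perplexity values copied from the list above (wikitext).
base_ppl = 6.9504   # base model
quant_ppl = 7.0138  # 3.18 bpw, h6

abs_increase = quant_ppl - base_ppl
rel_increase_pct = abs_increase / base_ppl * 100

print(f"+{abs_increase:.4f} perplexity ({rel_increase_pct:.2f}% relative)")
```

That works out to an increase of a bit under 1% relative to the base model.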

Prompt Format

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```
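
If you are building prompts in code, a minimal sketch of filling in the template above could look like this. `ALPACA_TEMPLATE` and `build_prompt` are hypothetical names for this example, and the trailing newline after `### Response:` is an assumption; adjust it if your frontend expects something different.

```python
# Template text mirrors the prompt format shown above.
# Assumption: a single newline follows "### Response:".
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "{prompt}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Insert the user's instruction into the template."""
    # Strip surrounding whitespace so the blank lines stay consistent.
    return ALPACA_TEMPLATE.format(prompt=instruction.strip())


if __name__ == "__main__":
    print(build_prompt("Write a haiku about quantization."))
```

The model's reply is whatever it generates after the `### Response:` line.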