
A 3-bit HQQ-quantized version of Meta-Llama-3.1-405B (base model). Quality is somewhat degraded relative to the full-precision weights, but the model should still be usable. Quantization parameters:

```
nbits=3, group_size=128, quant_zero=True, quant_scale=True, axis=0
```
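As a hedged sketch, the parameters above correspond to a quantization config in the `hqq` library (assuming the `mobiusml/hqq` package; argument names may differ across versions):

```python
# Sketch only: how the listed parameters map to an HQQ quantize config.
# Assumes the hqq library is installed (pip install hqq).
from hqq.core.quantize import BaseQuantizeConfig

quant_config = BaseQuantizeConfig(
    nbits=3,           # 3-bit weights
    group_size=128,    # quantization group size
    quant_zero=True,   # zero-points are themselves quantized
    quant_scale=True,  # scales are themselves quantized
    axis=0,            # quantize along axis 0
)
```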

The checkpoint has been split into shards with the `split` command. To recombine them into a single file:

```
cat qmodel_shard* > qmodel.pt
```
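The split/recombine round trip is lossless, since `split` just cuts the file into fixed-size byte ranges and `cat` concatenates them back in lexicographic order. A small self-contained demonstration with a dummy file (the real `qmodel.pt` and shard names are much larger; sizes here are illustrative only):

```shell
set -e
# Create a 1 MiB dummy file standing in for qmodel.pt
head -c 1048576 /dev/urandom > qmodel.pt
# Split into 256 KiB shards with numeric suffixes: qmodel_shard00, 01, ...
split -b 262144 -d qmodel.pt qmodel_shard
# Recombine; shell glob expansion sorts shard names lexicographically
cat qmodel_shard* > qmodel_recombined.pt
# Verify the recombined file is byte-identical to the original
cmp qmodel.pt qmodel_recombined.pt && echo "shards recombine losslessly"
```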