Add support for AQLM
#1 opened by BlackSamorez
AQLM is a SOTA 2-bit LLM quantization algorithm that shows remarkable accuracy for its compression ratio. It's fully integrated with transformers, and quite a few prequantized models are available.
Adding it to the leaderboard would shed light on what 2-bit quantization is really capable of.
Hi @BlackSamorez, we will support AQLM as soon as possible! Thanks~
@BlackSamorez please consider comparing your method with AutoRound, which has already shown remarkable results at W2G128 and W2G32, as presented in https://github.com/intel/auto-round/blob/main/docs/acc.md, without introducing any extra overhead at inference.
Hi @BlackSamorez, we have added AQLM. We have evaluated 2 models so far and will add results for more models.
BlackSamorez changed discussion status to closed