You are happy that @Meta has open-sourced Llama 3...
So you jump on @HuggingFace Hub to download the new shiny Llama 3 model only to see a few quintillion Llama 3s!
Which one should you use?
Not all Llamas are created equal!
An absolutely crazy comparison experiment by Wolfram Ravenwolf (@Wolfram) might answer your question!
- Comprehensive assessment of the Llama 3 Instruct 70B and 8B models.
- Tested 20 versions across HF, GGUF, and EXL2 formats.
- Methodology: the tests covered translation capabilities and cross-language understanding, using German data protection training exams, with deterministic generation settings to minimize random factors.
- Best performance came from the EXL2 4.5bpw quant, which scored perfectly in all tests.
- GGUF 8-bit to 4-bit quants also performed exceptionally.
- Llama 3 8B unquantized is the best in its size class but not as good as the 70B quants.
- 1-bit quantizations showed significant quality drops.
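
To get a feel for what "bpw" (bits per weight) means in practice, here is a rough back-of-the-envelope sketch, not part of the original experiment. It assumes nominal parameter counts of 70 billion and 8 billion taken from the model names, assumes 16 bits per weight for the unquantized model, and ignores KV cache, activations, and file-format overhead, so real file sizes and memory use will differ. The helper name `approx_weight_gb` is made up for illustration.

```python
def approx_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough weight storage in decimal gigabytes.

    Assumption: size = parameters * bits per weight / 8 bytes.
    Ignores KV cache, activations, and container/metadata overhead.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Nominal parameter counts inferred from the model names (an assumption).
PARAMS_70B = 70e9
PARAMS_8B = 8e9

# (label, parameter count, bits per weight); 16 bits for "unquantized" is an assumption.
configs = [
    ("70B at 4.5 bpw (EXL2-style)", PARAMS_70B, 4.5),
    ("70B at 8 bits", PARAMS_70B, 8.0),
    ("70B at 4 bits", PARAMS_70B, 4.0),
    ("70B at 1 bit", PARAMS_70B, 1.0),
    ("8B unquantized (16 bits)", PARAMS_8B, 16.0),
]

for label, n_params, bpw in configs:
    print(f"{label}: ~{approx_weight_gb(n_params, bpw):.1f} GB of weights")
```

Under these assumptions, a 70B model at 4.5 bpw still needs more than twice the weight storage of an unquantized 8B model, which is worth keeping in mind when comparing the two size classes.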