Sparse-Llama-3.1-2of4 • Collection • 2:4 sparse versions of Llama-3.1, including transfer learning • 10 items • Updated Dec 18, 2024 • 4
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment • Paper • 2405.03594 • Published May 6, 2024 • 7
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization • Paper • 2411.02355 • Published Nov 4, 2024 • 47
Llama-3.2 Quantization • Collection • Llama 3.2 models quantized by Neural Magic • 9 items • Updated Sep 26, 2024 • 9
Llama-3.1 Quantization • Collection • Neural Magic quantized Llama-3.1 models • 22 items • Updated Nov 22, 2024 • 43
INT8 LLMs for vLLM • Collection • Accurate INT8 quantized models by Neural Magic, ready for use with vLLM! • 50 items • Updated Sep 26, 2024 • 14