- PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs — arXiv:2410.05265, published Oct 7, 2024
- ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models — arXiv:2310.04564, published Oct 6, 2023