mobiuslabsgmbh/Llama-3.1-8b-instruct_4bitgs64_hqq_calib Text Generation • Updated Aug 27, 2024
Releasing the HQQ Llama-3.1-70b 4-bit quantized version! Check it out at mobiuslabsgmbh/Llama-3.1-70b-instruct_4bitgs64_hqq. It achieves 99% of the base model's performance across various benchmarks. Details in the model card.
Excited to announce the release of our high-quality Llama-3.1 8B 4-bit HQQ calibrated quantized model! Achieving an impressive 99.3% of FP16 performance, it also delivers the fastest inference speed for transformers. mobiuslabsgmbh/Llama-3.1-8b-instruct_4bitgs64_hqq_calib
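The `4bitgs64` suffix in these model names means the weights are stored at 4 bits with a quantization group size of 64. A minimal sketch of plain group-wise affine (scale/zero-point) quantization illustrates what that setting implies; note this is not the HQQ algorithm itself, which additionally optimizes the scale and zero-point via half-quadratic splitting. The function names below are illustrative, not part of any library API.

```python
import random

def quantize_group(group, bits=4):
    # Affine min/max quantization of one group of weights to `bits` bits.
    qmax = (1 << bits) - 1          # 15 for 4-bit
    lo, hi = min(group), max(group)
    scale = (hi - lo) / qmax or 1.0  # guard against a constant group
    q = [min(qmax, max(0, round((w - lo) / scale))) for w in group]
    return q, scale, lo

def dequantize_group(q, scale, lo):
    # Map integer codes back to approximate float weights.
    return [v * scale + lo for v in q]

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(64)]  # one group of 64
q, scale, zero = quantize_group(weights)
restored = dequantize_group(q, scale, zero)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
# max_err is bounded by scale / 2 (half a quantization step)
```

A smaller group size (e.g. the `2bitgs8` variant below) shrinks each group's dynamic range, which reduces rounding error at the cost of storing more scale/zero-point metadata.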
mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-3bit-metaoffload-HQQ Text Generation • Updated Feb 29, 2024
mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-2bitgs8-metaoffload-HQQ Text Generation • Updated Feb 29, 2024
mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-2bit-metaoffload-HQQ Text Generation • Updated Feb 29, 2024
mobiuslabsgmbh/Mixtral-8x7B-v0.1-hf-attn-4bit-moe-2bit-HQQ Text Generation • Updated Jan 8, 2024
mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-2bit_g16_s128-HQQ Text Generation • Updated Jan 8, 2024