
A Mistral-7B model pruned to 50% sparsity (pruned50) and quantized with AutoGPTQ, for inference with the Marlin kernel.

Please see my tutorial on how to run this model:

https://vilsonrodrigues.medium.com/sparse-quantize-and-serving-llms-with-neuralmagic-autogptq-and-vllm-03961b72ec3a
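The tutorial covers sparsifying, quantizing, and serving the model with NeuralMagic, AutoGPTQ, and vLLM. As a rough orientation only, a minimal serving sketch with vLLM might look like the following; the repository ID placeholder and the `quantization="marlin"` argument are assumptions here, so follow the linked tutorial for the exact, tested setup.

```python
# Minimal sketch (not the tutorial's exact code): serving a GPTQ-quantized,
# Marlin-format checkpoint with vLLM. Replace the model ID with this repository.
from vllm import LLM, SamplingParams

llm = LLM(
    model="<this-repo-id>",       # placeholder: the Hugging Face repo of this model
    quantization="marlin",        # assumption: Marlin-format GPTQ weights
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain sparse + quantized inference in one sentence."], params)
print(outputs[0].outputs[0].text)
```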

Model details:
- Format: Safetensors
- Model size: 1.14B params
- Tensor types: I32, FP16