Llama 2 7B quantized with AutoGPTQ v0.3.0.
- Group size: 32
- Data type: INT4
This model is compatible with the first version of QA-LoRA.
To fine-tune it with QA-LoRA, follow this tutorial: Fine-tune Quantized Llama 2 on Your GPU with QA-LoRA
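To illustrate what the settings above mean, here is a minimal NumPy sketch of group-wise INT4 quantization with group size 32: each group of 32 weights shares one scale and one zero point, and values are stored as 4-bit integers (0–15). This is only a round-to-nearest illustration of the storage format, not the AutoGPTQ algorithm itself (GPTQ chooses quantized values to minimize layer output error rather than rounding naively).

```python
import numpy as np

def quantize_int4_groupwise(weights, group_size=32):
    """Asymmetric round-to-nearest INT4 quantization, one scale/zero per group.

    Illustrative only -- AutoGPTQ uses the error-minimizing GPTQ algorithm,
    not plain rounding.
    """
    w = weights.reshape(-1, group_size)
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0           # 4 bits -> 16 levels (0..15)
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant groups
    q = np.clip(np.round((w - w_min) / scale), 0, 15).astype(np.uint8)
    return q, scale, w_min

def dequantize(q, scale, w_min):
    # Recover approximate float weights from INT4 codes.
    return q * scale + w_min

rng = np.random.default_rng(0)
w = rng.normal(size=128).astype(np.float32)
q, scale, zero = quantize_int4_groupwise(w, group_size=32)
w_hat = dequantize(q, scale, zero).reshape(-1)
```

A smaller group size (32 here, versus the more common 128) means more scale/zero parameters are stored, slightly increasing model size but reducing quantization error within each group.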