---
license: apache-2.0
language:
- en
---

# Model Card for Llama 2 7B (GPTQ, 2-bit)

This is Meta's Llama 2 7B quantized to 2-bit precision with GPTQ, using the AutoGPTQ integration in Hugging Face Transformers.

## Model Details

### Model Description

- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
- **Model type:** Causal language model (Llama 2)
- **Language(s) (NLP):** English
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0), [Llama 2 license agreement](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)

### Model Sources

The method and code used to quantize the model are explained here:

[Quantize and Fine-tune LLMs with GPTQ Using Transformers and TRL](https://kaitchup.substack.com/p/quantize-and-fine-tune-llms-with)

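As a rough illustration of the kind of workflow the linked article covers, the sketch below quantizes a Llama 2 checkpoint to 2-bit with `GPTQConfig` from Transformers. It is not the author's exact script: the base checkpoint ID `meta-llama/Llama-2-7b-hf`, the `c4` calibration set, the group size of 128, and the output directory are assumptions chosen for illustration, and the helper names `gptq_settings` and `quantize` are hypothetical.

```python
# Sketch only: assumes `transformers`, `optimum`, and `auto-gptq` are installed
# and that a GPU is available. Not the exact script from the article.

ASSUMED_BASE_MODEL = "meta-llama/Llama-2-7b-hf"  # assumption: gated repo, access required
SUPPORTED_BITS = (2, 3, 4, 8)  # bit widths accepted by GPTQConfig


def gptq_settings(bits=2, dataset="c4", group_size=128):
    """Collect GPTQ settings; dataset and group_size are illustrative assumptions."""
    if bits not in SUPPORTED_BITS:
        raise ValueError(f"bits must be one of {SUPPORTED_BITS}, got {bits}")
    return {"bits": bits, "dataset": dataset, "group_size": group_size}


def quantize(base_model=ASSUMED_BASE_MODEL, output_dir="llama-2-7b-gptq-2bit", bits=2):
    """Quantize `base_model` with GPTQ and save the result to `output_dir`."""
    # Imported lazily so the settings helper works without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    config = GPTQConfig(tokenizer=tokenizer, **gptq_settings(bits=bits))
    # Passing a GPTQConfig triggers calibration and quantization while loading.
    model = AutoModelForCausalLM.from_pretrained(
        base_model, quantization_config=config, device_map="auto"
    )
    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
    return model, tokenizer
```

Calling `quantize()` downloads the base model and runs calibration, which can take a long time on a single GPU. The settings helper is kept separate so the chosen bit width can be validated before any download starts.
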
## Uses

This is a pre-trained base model that has not been fine-tuned. You can fine-tune it with PEFT by training adapters on top of the quantized weights.

Note that 2-bit quantization significantly degrades the model's performance compared with the original Llama 2 7B.

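The fine-tuning setup mentioned above could look roughly like the following sketch, which loads the quantized checkpoint and attaches a LoRA adapter with PEFT. The repository ID `kaitchup/Llama-2-7b-gptq-2bit` is an assumption made by analogy with the sibling repositories, and the LoRA hyperparameters and the helper names `lora_settings` and `load_with_adapter` are illustrative, not values recommended by the author.

```python
# Sketch only: assumes `transformers`, `peft`, `optimum`, and `auto-gptq`
# are installed and that a GPU is available.

ASSUMED_MODEL_ID = "kaitchup/Llama-2-7b-gptq-2bit"  # assumption: inferred from sibling repos


def lora_settings(rank=16):
    """Return illustrative LoRA hyperparameters (assumptions, not tuned values)."""
    return {
        "r": rank,
        "lora_alpha": 2 * rank,
        "lora_dropout": 0.05,
        "bias": "none",
        "task_type": "CAUSAL_LM",
        # Attention projection layers in the Llama architecture.
        "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    }


def load_with_adapter(model_id=ASSUMED_MODEL_ID):
    """Load the quantized model and wrap it with a trainable LoRA adapter."""
    # Imported lazily so the settings helper works without these packages.
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # The GPTQ settings are read from the checkpoint's own configuration.
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    # Freezes the base weights and prepares the model for adapter training.
    model = prepare_model_for_kbit_training(model)
    model = get_peft_model(model, LoraConfig(**lora_settings()))
    return model, tokenizer
```

The returned model can then be passed to a standard training loop or trainer. Only the adapter weights are updated, while the 2-bit base weights stay frozen.
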
## Other versions

- [kaitchup/Llama-2-7b-gptq-4bit](https://huggingface.co/kaitchup/Llama-2-7b-gptq-4bit)
- [kaitchup/Llama-2-7b-gptq-3bit](https://huggingface.co/kaitchup/Llama-2-7b-gptq-3bit)

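To put the available bit widths in perspective, here is a back-of-the-envelope estimate of the storage needed for the quantized weights alone. It assumes a nominal 7 billion parameters and ignores per-group scales and zero points, any layers left unquantized, and file-format overhead, so real checkpoint sizes will be larger. The function name `weight_storage_gb` is hypothetical.

```python
def weight_storage_gb(bits, n_params=7_000_000_000):
    """Approximate weight storage in GB (1 GB = 1e9 bytes) at a given bit width."""
    return n_params * bits / 8 / 1e9


# Compare 16-bit weights with the 4-bit, 3-bit, and 2-bit variants.
for bits in (16, 4, 3, 2):
    print(f"{bits:>2}-bit: {weight_storage_gb(bits):.3f} GB")
```

Under these assumptions the 2-bit weights need about one eighth of the storage of 16-bit weights, which is the trade-off made in exchange for the performance loss noted under Uses.
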
## Model Card Contact

[The Kaitchup](https://kaitchup.substack.com/)