Mistral 7B v0.1 Quantized by asya.ai
Mistral 7B v0.1 - AWQ
- Model creator: Mistral AI
- Original model: Mistral 7B v0.1
Description
This repo contains AWQ model files for Mistral AI's Mistral 7B v0.1.
The original model occupies 15 GB of GPU RAM and achieves a response score of 0.368; the quantized version uses 5 GB of GPU RAM and achieves 0.329.
Both models were validated on the HuggingFaceH4/no_robots dataset, with the response score computed as the cosine similarity between response embeddings.
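For illustration, the metric can be reproduced in a few lines: embed a reference response and a model response, then take their cosine similarity. The encoder used below is an assumption for the sketch; the card does not state which embedding model produced the scores above.

# Illustrative sketch of the response-score metric: cosine similarity
# between embeddings of a reference response and a model response.
# The encoder choice here is an assumption, not the one used for the card's scores.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

reference_response = "Paris is the capital of France."
model_response = "The capital of France is Paris."

embeddings = encoder.encode([reference_response, model_response], convert_to_tensor=True)
score = util.cos_sim(embeddings[0], embeddings[1]).item()
print(f"Response score (cosine similarity): {score:.3f}")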
How to use this AWQ model from Python code
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_name_or_path = "asya-ai/Mistral-7B-v0.1-AWQ"

# Load the quantized model and its tokenizer
model = AutoAWQForCausalLM.from_quantized(
    model_name_or_path,
    fuse_layers=True,
    trust_remote_code=False,
    safetensors=True
)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=False)

# Mistral 7B v0.1 is a base (pretrained) model, so the prompt is plain text with no chat template
prompt = "Tell me about AI"
prompt_template = f'''{prompt}
'''

print("\n\n*** Generate:")

# Tokenize the prompt and move the input ids to the GPU
tokens = tokenizer(
    prompt_template,
    return_tensors='pt'
).input_ids.cuda()

# Generate output
generation_output = model.generate(
    tokens,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    max_new_tokens=512
)

print("Output: ", tokenizer.decode(generation_output[0]))
Compatibility
The files provided are tested to work with AutoAWQ, as used in the example above.
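Recent transformers releases (4.35.0 and later) can also load AWQ checkpoints directly through AutoModelForCausalLM, provided autoawq is installed. This path has not been verified against this specific repo, so treat the snippet below as a sketch.

# Sketch: load the AWQ checkpoint via transformers' built-in AWQ integration
# (requires transformers >= 4.35 plus autoawq; device_map="auto" needs accelerate).
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "asya-ai/Mistral-7B-v0.1-AWQ",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("asya-ai/Mistral-7B-v0.1-AWQ")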
Original model card: Mistral AI's Mistral 7B v0.1
Model Card for Mistral-7B-v0.1
The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.
For full details of this model, please read our release blog post.
Model Architecture
Mistral-7B-v0.1 is a transformer model with the following architecture choices (a config sketch follows the list):
- Grouped-Query Attention
- Sliding-Window Attention
- Byte-fallback BPE tokenizer
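The first two choices are visible in the base model's Hugging Face configuration: grouped-query attention appears as num_key_value_heads being smaller than num_attention_heads, and the attention window as sliding_window. A minimal sketch using transformers' AutoConfig (field names follow MistralConfig):

# Sketch: read the architecture choices from the base model's config.
# Field names follow transformers' MistralConfig.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
print("attention heads:       ", config.num_attention_heads)
print("key/value heads (GQA): ", config.num_key_value_heads)
print("sliding-window size:   ", config.sliding_window)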
The Mistral AI Team
Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.