---
license: llama2
datasets:
  - teknium/GPT4-LLM-Cleaned
---

# Model Card for traclm-v2-7b-instruct-GGUF

This repo contains several GGUF quantizations of TRAC-MTRY/traclm-v2-7b-instruct, enabling the model to run on low-resource hardware.

Available quantizations are listed here:

| Name | Quant method | Bits | Size | Use case |
| ---- | ------------ | ---- | ---- | -------- |
| TRAC-MTRY/traclm-v2-7b-instruct-2q_k | Q2_K | 2 | 2.83 GB | smallest, significant quality loss - not recommended for most purposes |
| TRAC-MTRY/traclm-v2-7b-instruct-3q_k_m | Q3_K_M | 3 | 3.3 GB | very small, high quality loss |
| TRAC-MTRY/traclm-v2-7b-instruct-4q_k_m | Q4_K_M | 4 | 4.08 GB | medium, balanced quality - recommended |
| TRAC-MTRY/traclm-v2-7b-instruct-5q_k_m | Q5_K_M | 5 | 4.78 GB | large, very low quality loss - recommended |
| TRAC-MTRY/traclm-v2-7b-instruct-6q_k_m | Q6_K | 6 | 5.53 GB | very large, extremely low quality loss |

Note: an unquantized fp16 version in GGUF format is also provided; see the repo files.

Read more about GGUF quantization here.

Read more about the unquantized model here.

## Prompt Format

This model was fine-tuned with the Alpaca prompt format. It is highly recommended that you use the same format for all interactions with the model; using a different format will significantly degrade output quality.

Standard Alpaca Format:

```
### System:\nBelow is an instruction that describes a task. Write a response that appropriately completes the request.\n\n\n\n### Instruction:\n{prompt}\n\n### Response:\n
```
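As a minimal sketch, the template above can be filled in with a small helper function. The function name and example instruction below are illustrative, not part of this repo; the `\n` sequences in the template become real newlines in the final string.

```python
# Hypothetical helper: wraps a user instruction in the standard
# Alpaca-style template shown above (system preamble, then the
# instruction, then an open "### Response:" section for the model).
SYSTEM_PREAMBLE = (
    "### System:\nBelow is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n\n\n"
)

def build_prompt(instruction: str) -> str:
    """Return the full prompt string for a single instruction."""
    return f"{SYSTEM_PREAMBLE}### Instruction:\n{instruction}\n\n### Response:\n"

print(build_prompt("List three considerations for convoy planning."))
```

The resulting string is what you would pass as the raw prompt to your GGUF runtime of choice (e.g. llama.cpp), which then generates text after the `### Response:` marker.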

Input Field Variant:

```
### System:\nBelow is an instruction that describes a task. Write a response that appropriately completes the request.\n\n\n\n### Instruction:\n{prompt}\n\n### Input:\n{input}\n\n### Response:\n
```
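The input-field variant can be sketched the same way, with the extra `### Input:` section inserted only when supplementary context is supplied. The helper name and sample text are illustrative assumptions, not part of this repo.

```python
# Hypothetical helper for the input-field variant: same system
# preamble, but an optional "### Input:" section is added between
# the instruction and the response marker when input text is given.
SYSTEM_PREAMBLE = (
    "### System:\nBelow is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n\n\n"
)

def build_prompt_with_input(instruction: str, input_text: str = "") -> str:
    """Return the full prompt, including an Input section if provided."""
    prompt = f"{SYSTEM_PREAMBLE}### Instruction:\n{instruction}\n\n"
    if input_text:
        prompt += f"### Input:\n{input_text}\n\n"
    return prompt + "### Response:\n"

print(build_prompt_with_input(
    "Summarize the following passage.",
    "The quick brown fox jumps over the lazy dog.",
))
```

When no input text is passed, the helper degrades to the standard format above, so one function can serve both cases.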