anthonyyazdaniml
/

Llama-3.1-8B-Instruct-BioQA-AWQ

Text Generation

4-bit precision

Model card Files Files and versions Community

Llama-3.1-8B-Instruct-BioQA-AWQ

A quantized version of Llama-3.1-8B-Instruct. The model was quantized using AutoAWQ with biomedical question-answering (QA) data as calibration.

Key Details

Base Model: Llama-3.1-8B-Instruct
Calibration Data: Biomedical question-answering (QA)
Template: Official Llama chat format

Quantization Config

quant_config = {
    "zero_point": True,
    "q_group_size": 128,
    "w_bit": 4,
    "version": "GEMM"
}

License

The model follows the license of the base Llama-3.1-8B-Instruct model.

Downloads last month: 66

Safetensors

Model size

1.98B params

Tensor type

I32

·

FP16

·

Inference Providers NEW

Text Generation

This model is not currently available via any of the supported third-party Inference Providers, and HF Inference API was unable to determine this model's library.

Model tree for anthonyyazdaniml/Llama-3.1-8B-Instruct-BioQA-AWQ

Base model

meta-llama/Llama-3.1-8B

Finetuned

meta-llama/Llama-3.1-8B-Instruct

Quantized

(335)

this model