Wanted to make a couple of models: ( • ̀ω•́ )✧

Downloading the original model, converting it to GGUF-F16, creating imatrix.dat, quantizing to the required formats, and then uploading everything to the repository took four and a half hours in total: 💀

(It’s a pity I don’t have decent hardware for quantization.)
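For anyone who wants to reproduce the pipeline, here is a minimal sketch using llama.cpp's tools and huggingface_hub. The repo IDs, file names, quant types, and calibration file are placeholders (my assumptions, not necessarily the exact ones used here), and it assumes a local llama.cpp build with convert_hf_to_gguf.py, llama-imatrix, and llama-quantize available:

```python
import subprocess
from huggingface_hub import HfApi, snapshot_download

SRC_REPO = "author/original-model"   # hypothetical source repo
DST_REPO = "me/original-model-GGUF"  # hypothetical target repo
LLAMA_CPP = "./llama.cpp"            # path to a local llama.cpp checkout/build
QUANT_TYPES = ["Q4_K_M", "Q5_K_M"]   # example target formats

# 1. Download the original model from the Hub.
model_dir = snapshot_download(repo_id=SRC_REPO)

# 2. Convert it to a GGUF-F16 intermediate.
subprocess.run(
    ["python", f"{LLAMA_CPP}/convert_hf_to_gguf.py", model_dir,
     "--outtype", "f16", "--outfile", "model-F16.gguf"],
    check=True,
)

# 3. Create imatrix.dat from a calibration text (placeholder file name).
subprocess.run(
    [f"{LLAMA_CPP}/llama-imatrix", "-m", "model-F16.gguf",
     "-f", "calibration.txt", "-o", "imatrix.dat"],
    check=True,
)

# 4. Quantize to the required formats, guided by the imatrix.
for qtype in QUANT_TYPES:
    subprocess.run(
        [f"{LLAMA_CPP}/llama-quantize", "--imatrix", "imatrix.dat",
         "model-F16.gguf", f"model-{qtype}.gguf", qtype],
        check=True,
    )

# 5. Upload the quantized files to the target repository.
api = HfApi()
for qtype in QUANT_TYPES:
    api.upload_file(
        path_or_fileobj=f"model-{qtype}.gguf",
        path_in_repo=f"model-{qtype}.gguf",
        repo_id=DST_REPO,
    )
```

On weak hardware, the imatrix step over the F16 model is usually the slowest part, which is where most of those hours go.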

Link to original model and script:

Model details: GGUF, 13B parameters, llama architecture, 4-bit quantization.