GGML conversion of https://huggingface.co/digitous/Alpacino13b using https://github.com/ggerganov/llama.cpp/pull/896. (Edited to write the model with ftype 2 so it won't be incorrectly identified as ftype 4; the tensors are mostly q4_1, with some f16.)
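If you want to verify the ftype yourself, here is a minimal sketch that reads the header of a legacy (pre-GGUF) llama.cpp model file. It assumes the old header layout (magic, an optional format version, then seven little-endian int32 hparams ending in ftype); the filename is a placeholder for the actual .bin in this repo.

```python
import struct

# Sketch of an ftype check for a legacy (pre-GGUF) llama.cpp model file.
# Assumption: header is magic, optional version, then seven int32 hparams
# (n_vocab, n_embd, n_mult, n_head, n_layer, n_rot, ftype).
def read_ftype(path: str) -> int:
    with open(path, "rb") as f:
        magic = struct.unpack("<I", f.read(4))[0]
        if magic in (0x67676D66, 0x67676A74):   # 'ggmf' / 'ggjt': versioned formats
            f.read(4)                           # skip the format version field
        elif magic != 0x67676D6C:               # 'ggml': original unversioned format
            raise ValueError(f"unrecognized magic: {magic:#010x}")
        *_, ftype = struct.unpack("<7i", f.read(28))
        return ftype

print(read_ftype("ggml-model-q4_1.bin"))  # placeholder filename; should print 2
```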
GPTQ(cuda) quantization available here: https://huggingface.co/gozfarb/alpacino-13b-4bit-128g
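To run the GGML file from Python, something like the following should work with a pre-GGUF release of llama-cpp-python (the 0.1.x series, which still loaded GGML files). The filename, context size, and prompt are example values, not taken from this repo.

```python
from llama_cpp import Llama  # requires a pre-GGUF (0.1.x) llama-cpp-python release

# Sketch only: model_path is a placeholder for the .bin file in this repo.
llm = Llama(model_path="ggml-model-q4_1.bin", n_ctx=2048)
out = llm("Below is an instruction that describes a task.\n", max_tokens=64)
print(out["choices"][0]["text"])
```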