PLLuM-8x7B-chat GGUF Quantizations by Nondzu

DISCLAIMER: This is a quantized version of an existing model, PLLuM-8x7B-chat. I am not the author of the original model; I only host the quantized files and take no responsibility for them.

Prompt Format

The exact prompt template is not documented here. If in doubt, consult the original PLLuM-8x7B-chat model card, or rely on the chat template embedded in the GGUF metadata, as in the example below.
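A minimal sketch using llama.cpp (an assumption on my part: it presumes a recent build that ships the llama-cli binary and that the GGUF files carry a chat template in their metadata; the Q4_K_M file is used purely as an example):

# -cnv (conversation mode) applies the chat template stored in the GGUF metadata,
# so you do not have to reconstruct the prompt format by hand.
./llama-cli -m PLLuM-8x7B-chat-Q4_K_M.gguf -cnv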

Available Files

Below is a list of available quantized model files along with their quantization type, file size, and a short description.

| Filename | Quant Type | File Size | Description |
| --- | --- | --- | --- |
| PLLuM-8x7B-chat-Q2_K.gguf | Q2_K | 17 GB | Very low quality but surprisingly usable. |
| PLLuM-8x7B-chat-Q3_K.gguf | Q3_K | 21 GB | Low quality, suitable for setups with very limited RAM. |
| PLLuM-8x7B-chat-Q3_K_L.gguf | Q3_K_L | 23 GB | Lower quality but usable; the highest-quality 3-bit option. |
| PLLuM-8x7B-chat-Q3_K_M.gguf | Q3_K_M | 21 GB | Low quality; a reasonable compromise among the 3-bit quants. |
| PLLuM-8x7B-chat-Q3_K_S.gguf | Q3_K_S | 20 GB | Low quality with improved space efficiency. |
| PLLuM-8x7B-chat-Q4_K.gguf | Q4_K | 27 GB | Good quality for standard use. |
| PLLuM-8x7B-chat-Q4_K_M.gguf | Q4_K_M | 27 GB | Good quality; the default choice for most use cases – recommended. |
| PLLuM-8x7B-chat-Q4_K_S.gguf | Q4_K_S | 25 GB | Slightly lower quality with greater space savings – recommended when size is a priority. |
| PLLuM-8x7B-chat-Q5_0.gguf | Q5_0 | 31 GB | Very high quality (legacy quantization format). |
| PLLuM-8x7B-chat-Q5_K.gguf | Q5_K | 31 GB | Very high quality – recommended for demanding use cases. |
| PLLuM-8x7B-chat-Q5_K_M.gguf | Q5_K_M | 31 GB | High quality – recommended. |
| PLLuM-8x7B-chat-Q5_K_S.gguf | Q5_K_S | 31 GB | High quality, offered as an alternative with minimal quality loss. |
| PLLuM-8x7B-chat-Q6_K.gguf | Q6_K | 36 GB | Very high quality with quantized embed/output weights. |
| PLLuM-8x7B-chat-Q8_0.gguf | Q8_0 | 47 GB | Maximum quality quantization. |

Downloading Using Hugging Face CLI


First, ensure you have the Hugging Face CLI installed:

pip install -U "huggingface_hub[cli]"

Then, target a specific file to download:

huggingface-cli download Nondzu/PLLuM-8x7B-chat-GGUF --include "PLLuM-8x7B-chat-Q4_K_M.gguf" --local-dir ./

For larger files, you can either download them into a dedicated local directory (e.g., PLLuM-8x7B-chat-Q8_0) or straight into the current directory (./), as in the example below.
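For instance, to place the Q8_0 file in its own directory (the directory name is just a suggestion):

huggingface-cli download Nondzu/PLLuM-8x7B-chat-GGUF --include "PLLuM-8x7B-chat-Q8_0.gguf" --local-dir PLLuM-8x7B-chat-Q8_0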

Model Details

Format: GGUF
Model size: 46.7B params
Architecture: llama
