XelotX/WizardLM-2-8x22B-XelotX-iQuants

Tags: Text Generation, 4-bit precision, 8-bit precision, arxiv:2304.12244, arxiv:2306.08568, arxiv:2308.09583, text-generation-inference


MaziyarPanahi/WizardLM-2-8x22B-GGUF

Model creator: microsoft
Original model: microsoft/WizardLM-2-8x22B

Description

MaziyarPanahi/WizardLM-2-8x22B-GGUF contains GGUF-format model files for microsoft/WizardLM-2-8x22B.

How to download

To download only the quants you need, rather than cloning the entire repository, filter the files with --include:

huggingface-cli download MaziyarPanahi/WizardLM-2-8x22B-GGUF --local-dir . --include '*Q2_K*gguf'

On Windows:

huggingface-cli download MaziyarPanahi/WizardLM-2-8x22B-GGUF --local-dir . --include *Q4_K_S*gguf
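
The same selective download can be sketched in Python. This is a minimal sketch, assuming the huggingface_hub package is installed; the helper names (quant_pattern, select_files, download_quant) and the file names in the usage note are hypothetical and only illustrate how the glob filter narrows the file list.

```python
from fnmatch import fnmatchcase

REPO_ID = "MaziyarPanahi/WizardLM-2-8x22B-GGUF"


def quant_pattern(quant):
    """Build the same glob used with --include above, e.g. '*Q2_K*gguf'."""
    return f"*{quant}*gguf"


def select_files(filenames, quant):
    """Preview which file names a quant pattern would keep (no network needed)."""
    pattern = quant_pattern(quant)
    return [name for name in filenames if fnmatchcase(name, pattern)]


def download_quant(quant, local_dir="."):
    """Download only the files matching one quant (hypothetical helper).

    The import is done lazily so the helpers above work without
    huggingface_hub installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        local_dir=local_dir,
        allow_patterns=[quant_pattern(quant)],
    )
```

For example, select_files(["WizardLM-2-8x22B.Q2_K-00001-of-00005.gguf", "README.md"], "Q2_K") keeps only the .gguf entry, and download_quant("Q2_K") would then fetch the matching shards into the current directory.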

Load sharded model

The model is split across several GGUF files. Pass only the first shard; llama_load_model_from_file detects the number of shards and loads the remaining tensors from the other files.

llama.cpp/main -m WizardLM-2-8x22B.Q2_K-00001-of-00005.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 1024 -e
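
Because only the first shard is passed on the command line, it can help to confirm that every shard is present before loading. The following is a minimal sketch, assuming the shards follow the -00001-of-00005 naming seen in the command above and sit in the same directory; expected_shards and missing_shards are hypothetical helpers, not part of llama.cpp.

```python
import re
from pathlib import Path

# Matches names such as "WizardLM-2-8x22B.Q2_K-00001-of-00005.gguf".
SHARD_RE = re.compile(r"^(?P<stem>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def expected_shards(first_shard_name):
    """List every shard file name implied by the first shard's name."""
    match = SHARD_RE.match(first_shard_name)
    if match is None:
        raise ValueError(f"not a sharded GGUF file name: {first_shard_name}")
    stem = match["stem"]
    total = int(match["total"])
    return [f"{stem}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]


def missing_shards(first_shard_path):
    """Return the shard names that are absent from the first shard's directory."""
    first = Path(first_shard_path)
    return [
        name
        for name in expected_shards(first.name)
        if not (first.parent / name).exists()
    ]
```

An empty list from missing_shards("WizardLM-2-8x22B.Q2_K-00001-of-00005.gguf") indicates that all five files are in place.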

Prompt template

{system_prompt}
USER: {prompt}
ASSISTANT: </s>

or

A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hi ASSISTANT: Hello.</s>
USER: {prompt} ASSISTANT: </s>......
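
The multi-turn template can be assembled programmatically. This is a minimal sketch under two assumptions not stated on this card: that </s> closes each completed assistant reply, and that the prompt sent for generation ends right after "ASSISTANT:" so the model writes the reply itself. The build_prompt helper is hypothetical.

```python
DEFAULT_SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_prompt(history, user_message, system_prompt=DEFAULT_SYSTEM_PROMPT):
    """Assemble a prompt in the USER/ASSISTANT format shown above.

    history is a list of (user_text, assistant_text) pairs for finished turns.
    Assumption: each finished assistant reply is terminated with </s>.
    """
    prompt = system_prompt + " "
    for user_text, assistant_text in history:
        prompt += f"USER: {user_text} ASSISTANT: {assistant_text}</s>"
    # Leave the final assistant turn open for the model to complete.
    prompt += f"USER: {user_message} ASSISTANT:"
    return prompt
```

Calling build_prompt([("Hi", "Hello.")], "Who are you?") reproduces the second template with one finished turn followed by an open assistant turn.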


GGUF

Model size: 141B params
Architecture: llama
Quantizations: 1-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit


Model tree for XelotX/WizardLM-2-8x22B-XelotX-iQuants

Base model: microsoft/WizardLM-2-8x22B
Quantized (18): this model
