gghfez/WizardLM-2-22b-RP-GGUF

@@ -2,7 +2,8 @@
 license: apache-2.0
 language:
 - en
-base_model: gghfez/WizardLM-2-22b-RP
 pipeline_tag: text-generation
 library_name: transformers
 tags:
@@ -10,50 +11,51 @@ tags:
 - creative
 - writing
 - roleplay
-- llama-cpp
-- gguf-my-repo
 ---
-# gghfez/WizardLM-2-22b-RP-Q6_K-GGUF
-This model was converted to GGUF format from [`gghfez/WizardLM-2-22b-RP`](https://huggingface.co/gghfez/WizardLM-2-22b-RP) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
-Refer to the [original model card](https://huggingface.co/gghfez/WizardLM-2-22b-RP) for more details on the model.
-## Use with llama.cpp
-Install llama.cpp through brew (works on Mac and Linux)
-```bash
-brew install llama.cpp
-```
-Invoke the llama.cpp server or the CLI.
-### CLI:
-```bash
-llama-cli --hf-repo gghfez/WizardLM-2-22b-RP-Q6_K-GGUF --hf-file wizardlm-2-22b-rp-q6_k.gguf -p "The meaning to life and the universe is"
-```
-### Server:
-```bash
-llama-server --hf-repo gghfez/WizardLM-2-22b-RP-Q6_K-GGUF --hf-file wizardlm-2-22b-rp-q6_k.gguf -c 2048
-```
-Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.
-Step 1: Clone llama.cpp from GitHub.
-```
-git clone https://github.com/ggerganov/llama.cpp
-```
-Step 2: Move into the llama.cpp folder and build it with `LLAMA_CURL=1` flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).
-```
-cd llama.cpp && LLAMA_CURL=1 make
-```
-Step 3: Run inference through the main binary.
-```
-./llama-cli --hf-repo gghfez/WizardLM-2-22b-RP-Q6_K-GGUF --hf-file wizardlm-2-22b-rp-q6_k.gguf -p "The meaning to life and the universe is"
-```
-or
 ```
-./llama-server --hf-repo gghfez/WizardLM-2-22b-RP-Q6_K-GGUF --hf-file wizardlm-2-22b-rp-q6_k.gguf -c 2048
 ```

 license: apache-2.0
 language:
 - en
+base_model:
+- gghfez/WizardLM-2-22b-RP
 pipeline_tag: text-generation
 library_name: transformers
 tags:
 - creative
 - writing
 - roleplay
 ---
+GGUF Quants of [gghfez/WizardLM-2-22B-RP](https://huggingface.co/gghfez/WizardLM-2-22B-RP)
+Original Model Card:
+# gghfez/WizardLM2-22b-RP
+<img src="https://files.catbox.moe/acl4ld.png" width="400"/>
+⚠️ **IMPORTANT: Experimental Model - Not recommended for Production Use**
+- This is an experimental model created through bespoke, unorthodox merging techniques
+- The safety alignment and guardrails from the original WizardLM2 model may be compromised
+- This model is intended for creative writing and roleplay purposes ONLY
+- Use at your own risk and with appropriate content filtering in place
+This model is an experimental derivative of WizardLM2-8x22B, created by extracting the individual experts from the original mixture-of-experts (MoE) model, renaming the mlp modules to match the Mistral architecture, and merging them into a single dense model using linear merging via mergekit.
+The resulting model initially produced gibberish, but after fine-tuning on synthetic data generated by the original WizardLM2-8x22B, it regained the ability to generate relatively coherent text. However, the model exhibits confusion about world knowledge and mixes up the names of well-known people.
+Despite efforts to train the model on factual data, the confusion persisted, so instead I trained it for creative tasks.
+As a result, this model is not recommended for use as a general assistant or for tasks that require accurate real-world knowledge (don't bother running MMLU-Pro on it).
+It actually retrieves details from the provided context very accurately, but I still can't recommend it for anything other than creative tasks.
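The expert extraction and linear merge described above can be illustrated with a toy sketch. This is not the author's script and not mergekit's code: it assumes the Hugging Face tensor naming for Mixtral experts (`block_sparse_moe.experts.N.w1/w2/w3`) and Mistral MLPs (`gate_proj`/`down_proj`/`up_proj`), and it assumes equal merge weights, which the card does not state. `split_expert_key` and `linear_merge` are hypothetical helper names, and tensors are plain Python lists to keep the example dependency-free.

```python
import re

# Assumed mapping from Mixtral expert projections to Mistral MLP names
# (Hugging Face naming convention; not taken from the model card).
EXPERT_TO_MLP = {"w1": "gate_proj", "w2": "down_proj", "w3": "up_proj"}

_EXPERT_KEY = re.compile(
    r"^(.*)\.block_sparse_moe\.experts\.(\d+)\.(w[123])\.weight$"
)


def split_expert_key(key):
    """Return (expert_index, dense_key) for an expert tensor name, else None."""
    match = _EXPERT_KEY.match(key)
    if match is None:
        return None
    prefix, index, proj = match.groups()
    return int(index), f"{prefix}.mlp.{EXPERT_TO_MLP[proj]}.weight"


def linear_merge(tensors, weights=None):
    """Element-wise weighted average of equally sized flat tensors."""
    if weights is None:
        # Assumption: every expert contributes equally.
        weights = [1.0] * len(tensors)
    total = sum(weights)
    return [
        sum(w * t[i] for w, t in zip(weights, tensors)) / total
        for i in range(len(tensors[0]))
    ]


# Rename one expert tensor, then average two tiny stand-in "experts".
print(split_expert_key("model.layers.0.block_sparse_moe.experts.3.w1.weight"))
print(linear_merge([[1.0, 2.0], [3.0, 6.0]]))
```

Averaging experts that were trained to specialize discards the router's choice between them, which is consistent with the merged model needing fine-tuning before it produced coherent text.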
+## Prompt format
+Mistral-v1 plus the system tags from Mistral-V7:
 ```
+[SYSTEM_PROMPT] {system}[SYSTEM_PROMPT] [INST] {prompt}[/INST]
 ```
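A minimal sketch of filling in the template above from Python. `build_prompt` is a hypothetical helper, not part of any library. The tag strings default to exactly what the template shows; they are parameters because Mistral-V7 normally closes the system block with `[/SYSTEM_PROMPT]`, so it is worth checking the tokenizer's added tokens before relying on either form. BOS/EOS handling is left to the tokenizer or inference server.

```python
def build_prompt(user_prompt, system=None,
                 sys_open="[SYSTEM_PROMPT]", sys_close="[SYSTEM_PROMPT]"):
    """Format a single-turn prompt following the template in this card.

    The default tags are copied verbatim from the card; pass
    sys_close="[/SYSTEM_PROMPT]" if the tokenizer expects that form.
    """
    parts = []
    if system:
        # Matches "[SYSTEM_PROMPT] {system}[SYSTEM_PROMPT] " in the template.
        parts.append(f"{sys_open} {system}{sys_close} ")
    # Matches "[INST] {prompt}[/INST]" in the template.
    parts.append(f"[INST] {user_prompt}[/INST]")
    return "".join(parts)


print(build_prompt("Describe the tavern.", system="You are a narrator."))
```

The system block is omitted entirely when no system prompt is given, which leaves a plain Mistral-v1 style instruction.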
+**NOTE:** This model is based on WizardLM2-8x22B, which is a finetune of Mixtral-8x22B - not to be confused with the more recent Mistral-Small-22B model.
+As such, it uses the same vocabulary and tokenizer as Mixtral-v0.1 and inherits the Apache 2.0 license.
+I expanded the vocab to include the system prompt and instruction tags before training (including embedding heads).
+## Quants
+TODO
+## Examples:
+### Strength: Information Extraction from Context
+[example 1]
+### Weakness: Basic Factual Knowledge
+[example 2]