---
license: llama3
---

This is an experimental 2x8B MoE with random gates, built from the following two models:

- Hermes-2-Theta-Llama-3-8B by Nous Research: https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B
- llama-3-cat-8b-instruct-v1 by TheSkullery: https://huggingface.co/TheSkullery/llama-3-cat-8b-instruct-v1

***Important:*** Make sure to add `` as a stop sequence, since llama-3-cat-8b-instruct-v1 is the base model.

Update: Due to requests, I decided to add the rest of the quants. Enjoy!

Mergekit recipe of the model, if you're too lazy to check the files:

```
base_model: TheSkullery/llama-3-cat-8b-instruct-v1
gate_mode: random
dtype: bfloat16
experts_per_token: 2
experts:
  - source_model: TheSkullery/llama-3-cat-8b-instruct-v1
    positive_prompts:
      - " "
  - source_model: NousResearch/Hermes-2-Theta-Llama-3-8B
    positive_prompts:
      - " "
```
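
If you want to reproduce the merge yourself, a minimal sketch of the mergekit invocation is below. The file name `config.yaml` and the output directory are placeholders, not part of this repo; save the recipe above under whatever name you prefer.

```
# Install mergekit (also installable from the arcee-ai/mergekit GitHub repo)
pip install mergekit

# Build the 2x8B MoE from the recipe above
mergekit-moe config.yaml ./output-model-directory
```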