---
license: llama3
---

This is an experimental 2x8B MoE with random gates, built from the following two models:

- [Hermes-2-Theta-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B) by Nous Research
- [llama-3-cat-8b-instruct-v1](https://huggingface.co/TheSkullery/llama-3-cat-8b-instruct-v1) by TheSkullery


***Important*** 

Make sure to add `</s>` as a stop sequence, since the merge uses llama-3-cat-8b-instruct-v1 as the base model.
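
A minimal sketch of what that looks like with llama-cpp-python when running one of the GGUF quants (the model filename below is just a placeholder for whichever quant you downloaded):

```python
# Sketch: run a GGUF quant and stop generation on "</s>".
# The model path is a placeholder, not an actual filename from this repo.
from llama_cpp import Llama

llm = Llama(model_path="./moe-2x8b.Q4_K_M.gguf", n_ctx=4096)

output = llm(
    "Write a short haiku about spring.",
    max_tokens=128,
    stop=["</s>"],  # needed because of the cat-8B base model
)
print(output["choices"][0]["text"])
```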

Update:

Due to requests, I decided to add the rest of the quants. Enjoy!


Mergekit recipe for the model, in case you don't want to dig through the files:

```
base_model: TheSkullery/llama-3-cat-8b-instruct-v1
gate_mode: random
dtype: bfloat16
experts_per_token: 2
experts:
  - source_model: TheSkullery/llama-3-cat-8b-instruct-v1
    positive_prompts:
      - " "
  - source_model: NousResearch/Hermes-2-Theta-Llama-3-8B
    positive_prompts:
      - " "
```
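
If you want to reproduce the merge yourself, save the recipe above to a YAML file and, assuming a current mergekit install, run it through the `mergekit-moe` entry point, e.g. `mergekit-moe config.yml ./output-model`.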