---
license: llama3
---
This is an experimental 2x8B MoE (mixture of experts) with random gates, built from the following two models:
- Hermes-2-Theta-l3-8B by Nous Research https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B
- llama-3-cat-8B-instruct-V1 by TheSkullery https://huggingface.co/TheSkullery/llama-3-cat-8b-instruct-v1
***Important***
Make sure to add `</s>` as a stop sequence, because this model uses llama-3-cat-8B-instruct-V1 as the base model.
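
Most frontends and inference servers have a stop-sequence setting where `</s>` can be entered directly. If yours does not, the output can be trimmed after generation instead. The following is a minimal sketch of that fallback, not part of this model's tooling: the helper name `truncate_at_stop` and the sample string are hypothetical, chosen only for illustration.

```python
def truncate_at_stop(text: str, stop_sequences=("</s>",)) -> str:
    """Return `text` cut at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            # Skip empty strings: str.find("") returns 0 and would erase everything.
            continue
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# Hypothetical raw output that ran past the end of the assistant turn.
raw_output = "The cat sat on the mat.</s>user\nWhat next?"
print(truncate_at_stop(raw_output))
```

Trimming after the fact only cleans up the text. The model still spends time generating the discarded tokens, so setting `</s>` as a real stop sequence in the backend is preferable when that option exists.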