Crystalcareai committed
Commit adb187c • 1 Parent(s): d269b02
Update README.md
README.md CHANGED
@@ -1,6 +1,6 @@
 <p align="center"> <img src="https://huggingface.co/Crystalcareai/LlaMoE-Medium/resolve/main/resources/ddb-nye2T3C3vZwJJm1l6A.png" width="auto" title="LlaMoE-Medium model image"> </p>
 
-This is a 4x8b Llama Mixture of Experts (MoE) model. It was trained on
+This is a 4x8b Llama Mixture of Experts (MoE) model. It was trained on OpenHermes Resort from the Dolphin-2.9 dataset.
 
 The model is a combination of 4 Llama fine-tunes, using DeepSpeed-MoE's architecture. All experts are active for every token.
 
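The updated README describes a dense-routing MoE: four Llama-based experts whose outputs are all used for every token, rather than the top-k routing common in sparse MoE models. The sketch below is only an illustration of that "all experts active" idea, not the actual DeepSpeed-MoE or LlaMoE-Medium implementation; the class name, layer sizes, and softmax gating are assumptions chosen for clarity.

```python
# Illustrative sketch of dense (all-experts-active) MoE routing in PyTorch.
# NOT the real LlaMoE-Medium / DeepSpeed-MoE code; sizes and gating are assumed.
import torch
import torch.nn as nn


class DenseMoE(nn.Module):
    """Combines N expert FFNs; every expert contributes to every token."""

    def __init__(self, hidden_size: int = 4096, ffn_size: int = 14336, num_experts: int = 4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, ffn_size),
                nn.SiLU(),
                nn.Linear(ffn_size, hidden_size),
            )
            for _ in range(num_experts)
        )
        # Router produces one weight per expert; with dense routing we keep all of them.
        self.router = nn.Linear(hidden_size, num_experts)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # weights: (batch, seq, num_experts), softmax over the expert dimension
        weights = torch.softmax(self.router(hidden_states), dim=-1)
        # expert_out: (batch, seq, num_experts, hidden)
        expert_out = torch.stack([expert(hidden_states) for expert in self.experts], dim=2)
        # Weighted sum over experts -- no top-k pruning, so all experts
        # are active for every token, as the README states.
        return torch.einsum("bse,bseh->bsh", weights, expert_out)


if __name__ == "__main__":
    layer = DenseMoE(hidden_size=64, ffn_size=128, num_experts=4)
    x = torch.randn(2, 8, 64)          # (batch, seq, hidden)
    print(layer(x).shape)              # torch.Size([2, 8, 64])
```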