Cartinoe5930
committed on
Update README.md
README.md
CHANGED
@@ -2,9 +2,28 @@
|
|
2 |
license: apache-2.0
|
3 |
---
|
4 |
|
5 |
-
This model is
|
6 |
-
We introduce 'interlocked-DUS', which is a basis of iDUS model.
|
7 |
-
Stay tuned for the update of iDUS model.
|
8 |
|
9 |
-
|
10 |
-
|
2 |
license: apache-2.0
|
3 |
---
|
4 |
|
5 |
+
⛔ This model is not the iDUS model. It is a variant built to test the effectiveness of iDUS.
|
6 |
|
7 |
+
# interlocked-DUS(iDUS)
|
8 |
+
|
9 |
+
We attempted to improve model performance by further reducing the layer distance while staying close to the framework of DUS (depth up-scaling).
|
10 |
+
|
11 |
+
## Architectural Details
|
12 |
+
|
13 |
+
We propose **interlocked-DUS (iDUS)**, a variant of DUS!
|
14 |
+
As the name suggests, iDUS does not connect the layers as whole blocks the way DUS does. Instead, it divides them into groups and merges the groups so that they interlock with each other.
|
15 |
+
With this mechanism, iDUS aims to reduce the layer distance, which was an important factor in DUS, more effectively than DUS itself, and is intended to process inputs more effectively as a result.
|
16 |
+
The figure below illustrates the overall framework of iDUS.
|
17 |
+
|
18 |
+
<p align="center"><img src="/assets/iDUS.png"></p>
|
19 |
+
|
20 |
+
This model interlocks the layers one layer at a time (a group size of one layer) to test the effectiveness of iDUS.
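
The grouping and interlocking described above can be illustrated with a small index-level sketch. This is not the authors' implementation: the exact interlocking scheme is not spelled out here, so the sketch assumes a 32-layer base model, the usual DUS split (one copy drops its last 8 layers, the other drops its first 8), and a simple alternation of equal-sized groups. The function names `dus_layer_order`, `interlocked_layer_order`, and `max_layer_jump`, and the "largest jump in original layer index" measure, are all hypothetical and used only for illustration.

```python
from typing import List


def dus_layer_order(n_layers: int = 32, n_drop: int = 8) -> List[int]:
    """DUS-style stacking (assumed): copy A keeps the first layers,
    copy B keeps the last layers, and the two are concatenated whole."""
    copy_a = list(range(0, n_layers - n_drop))  # layers 0..23
    copy_b = list(range(n_drop, n_layers))      # layers 8..31
    return copy_a + copy_b


def interlocked_layer_order(
    n_layers: int = 32, n_drop: int = 8, group_size: int = 8
) -> List[int]:
    """Hypothetical interlocking: split both copies into groups of
    `group_size` layers and alternate the groups (A, B, A, B, ...)."""
    copy_a = list(range(0, n_layers - n_drop))
    copy_b = list(range(n_drop, n_layers))
    order: List[int] = []
    for start in range(0, len(copy_a), group_size):
        order.extend(copy_a[start:start + group_size])
        order.extend(copy_b[start:start + group_size])
    return order


def max_layer_jump(order: List[int]) -> int:
    """Largest gap in original layer index between adjacent stacked layers
    (an illustrative stand-in for 'layer distance')."""
    return max(abs(b - a) for a, b in zip(order, order[1:]))


if __name__ == "__main__":
    dus = dus_layer_order()
    idus_8 = interlocked_layer_order(group_size=8)
    idus_1 = interlocked_layer_order(group_size=1)
    print(len(dus), max_layer_jump(dus))        # 48 layers, jump of 15 at the seam
    print(len(idus_8), max_layer_jump(idus_8))  # 48 layers, largest jump is 7
    print(idus_1[:6])                           # [0, 8, 1, 9, 2, 10]
```

Under these assumptions, all three variants produce a 48-layer stack. The 8-layer grouping lowers the largest index jump from 15 to 7, whereas the 1-layer grouping introduces a jump of 7 or 8 at every layer boundary.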
|
21 |
+
|
22 |
+
## 🏆 HuggingFace Open LLM Leaderboard
|
23 |
+
|
24 |
+
|Model|ARC|HellaSwag|MMLU|TruthfulQA|Winogrande|GSM8K|Average|
|
25 |
+
|---|---|---|---|---|---|---|---|
|
26 |
+
|[Llama2_init_Mistral](https://huggingface.co/Cartinoe5930/Llama2_init_Mistral)|60.07|83.3|64.09|42.15|78.37|37.91|60.98|
|
27 |
+
|[SOLAR-10.7B-DUS-Implementation](https://huggingface.co/Cartinoe5930/SOLAR-DUS-implement)|59.56|81.18|63.68|40.72|76.48|26.99|58.1|
|
28 |
+
|**iDUS-1layer**|27.73|26.65|24.91|48.58|49.17|0|29.51|
|
29 |
+
|[iDUS(iDUS-8layer)](https://huggingface.co/Cartinoe5930/SOLAR-10.7B-iDUS)|59.3|81.34|63.22|40.62|76.24|29.57|58.38|
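
The Average column appears to be the plain arithmetic mean of the six benchmark scores, rounded to two decimals. The short check below recomputes it from the values in the table; the variable and function names are illustrative only.

```python
# Benchmark scores copied from the table, in the order:
# ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K
scores = {
    "Llama2_init_Mistral": [60.07, 83.3, 64.09, 42.15, 78.37, 37.91],
    "SOLAR-10.7B-DUS-Implementation": [59.56, 81.18, 63.68, 40.72, 76.48, 26.99],
    "iDUS-1layer": [27.73, 26.65, 24.91, 48.58, 49.17, 0],
    "iDUS(iDUS-8layer)": [59.3, 81.34, 63.22, 40.62, 76.24, 29.57],
}

# Average column as reported in the table
reported = {
    "Llama2_init_Mistral": 60.98,
    "SOLAR-10.7B-DUS-Implementation": 58.1,
    "iDUS-1layer": 29.51,
    "iDUS(iDUS-8layer)": 58.38,
}


def mean_score(values):
    """Arithmetic mean rounded to two decimals."""
    return round(sum(values) / len(values), 2)


if __name__ == "__main__":
    for name, values in scores.items():
        print(f"{name}: computed={mean_score(values)}, reported={reported[name]}")
```

For all four rows the recomputed mean agrees with the reported Average.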
|