Update README.md
README.md
CHANGED
@@ -9,17 +9,43 @@ tags:
|
|
9 |
- mergekit
|
10 |
---
|
11 |
|
12 |
-
# **
|
13 |
|
14 |
-
|
15 |
|
16 |
## **Features**
|
17 |
- **Merge method:** SLERP (Spherical Linear Interpolation).
|
18 |
- **Source models:**
|
19 |
-
-
|
20 |
-
-
|
21 |
-
- **
|
22 |
-
-
|
23 |
- Mathematical reasoning.
|
24 |
- Contextual understanding.
|
25 |
- Instruction following.
|
@@ -29,12 +55,12 @@ tags:
|
|
29 |
```yaml
|
30 |
slices:
|
31 |
- sources:
|
32 |
-
- model:
|
33 |
-
layer_range: [0,
|
34 |
-
- model: Sakalti/ultiima-
|
35 |
-
layer_range: [0,
|
36 |
-
|
37 |
-
base_model:
|
38 |
parameters:
|
39 |
t:
|
40 |
- filter: self_attn
|
@@ -42,4 +68,5 @@ parameters:
|
|
42 |
- filter: mlp
|
43 |
value: [1, 0.75, 0.5, 0.25, 0]
|
44 |
- value: 0.5
|
45 |
-
dtype: bfloat16
|
|
|
|
9 |
- mergekit
|
10 |
---
|
11 |
|
12 |
+
# **ECE-TRIOMPHANT-2.1-YL-72B-SLERP-V1**
|
13 |
|
14 |
+
This model has been produced by:
|
15 |
+
- **ROBERGE Marial**, engineering student at French Engineering School ECE
|
16 |
+
- **ESCRIVA Mathis**, engineering student at French Engineering School ECE
|
17 |
+
- **LALAIN Youri**, engineering student at French Engineering School ECE
|
18 |
+
- **RAGE LILIAN**, engineering student at French Engineering School ECE
|
19 |
+
- **HUVELLE Baptiste**, engineering student at French Engineering School ECE
|
20 |
+
|
21 |
+
Under the supervision of:
|
22 |
+
- **Andre-Louis Rochet**, Lecturer at ECE & Co-Founder of TW3 Partners
|
23 |
+
- **Paul Lemaistre**, CTO of TW3 Partners
|
24 |
+
|
25 |
+
With the contribution of:
|
26 |
+
- **ECE engineering school** as sponsor and financial contributor
|
27 |
+
- **François STEPHAN** as director of ECE
|
28 |
+
- **Gérard REUS** as acting director of iLAB
|
29 |
+
- **Matthieu JOLLARD**, ECE alumnus
|
30 |
+
- **Louis GARCIA**, ECE alumnus
|
31 |
+
|
32 |
+
### Supervisory structure
|
33 |
+
The iLab (intelligence Lab) is a structure created by ECE and dedicated to artificial intelligence.
|
34 |
+
|
35 |
+
### About ECE
|
36 |
+
ECE, a multi-program, multi-campus, and multi-sector engineering school specializing in digital engineering, trains engineers and technology experts for the 21st century, capable of meeting the challenges of the dual digital and sustainable development revolutions.
|
37 |
+
|
38 |
+
**ECE-TRIOMPHANT-2.1-YL-72B-SLERP-V1** is a merged language model built from **Sakalti/ultiima-72B** and **MaziyarPanahi/calme-3.2-instruct-78b**. It uses the **SLERP (Spherical Linear Interpolation)** merge method with the aim of combining the strengths of both source models on complex natural language processing (NLP) tasks.
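
As a rough illustration of what SLERP does to a pair of weight vectors, the sketch below interpolates along the arc between them instead of along a straight line. It is a minimal pure-Python version written for illustration, not mergekit's implementation: the function name `slerp`, the `eps` tolerance, and the fallback to linear interpolation for near-parallel or zero vectors are all assumptions.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    t=0 returns v0 and t=1 returns v1. Falls back to plain linear
    interpolation when the angle between the vectors is (almost) zero
    or when either vector has (almost) zero norm.
    """
    if len(v0) != len(v1):
        raise ValueError("vectors must have the same length")

    lerp = [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        return lerp

    # Cosine of the angle between the two vectors, clamped for safety.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return lerp

    # Standard SLERP weights applied to the original vectors.
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]
```

For two orthogonal unit vectors, `slerp(0.5, [1.0, 0.0], [0.0, 1.0])` returns a vector halfway along the arc that still has unit norm, whereas plain averaging would shrink the norm to about 0.71.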
|
39 |
|
40 |
## **Features**
|
41 |
- **Merge method:** SLERP (Spherical Linear Interpolation).
|
42 |
- **Source models:**
|
43 |
+
- [Sakalti/ultiima-72B](https://huggingface.co/Sakalti/ultiima-72B)
|
44 |
+
- [MaziyarPanahi/calme-3.2-instruct-78b](https://huggingface.co/MaziyarPanahi/calme-3.2-instruct-78b)
|
45 |
+
- **Strengths:**
|
46 |
+
- Improved performance on multi-domain and reasoning tasks.
|
47 |
+
- Extended processing capability from merging the critical layers.
|
48 |
+
- **Target applications:**
|
49 |
- Mathematical reasoning.
|
50 |
- Contextual understanding.
|
51 |
- Instruction following.
|
|
|
55 |
```yaml
|
56 |
slices:
|
57 |
- sources:
|
58 |
+
- model: MaziyarPanahi/calme-3.2-instruct-78b
|
59 |
+
layer_range: [0, 80] # Limited to 80 layers
|
60 |
+
- model: Sakalti/ultiima-72B
|
61 |
+
layer_range: [0, 80] # Matches the range used for the 78B model
|
62 |
+
merge_method: slerp
|
63 |
+
base_model: MaziyarPanahi/calme-3.2-instruct-78b
|
64 |
parameters:
|
65 |
t:
|
66 |
- filter: self_attn
|
|
|
68 |
- filter: mlp
|
69 |
value: [1, 0.75, 0.5, 0.25, 0]
|
70 |
- value: 0.5
|
71 |
+
dtype: bfloat16
|
72 |
+
```
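
The list given for `t` under the `mlp` filter, `[1, 0.75, 0.5, 0.25, 0]`, can be read as a gradient over the layer range. The sketch below shows one plausible reading, assumed here and not taken from mergekit's source: the anchor values are spread evenly over the layers and linearly interpolated in between. The helper name `gradient_value` is hypothetical, and the count of 80 layers comes from `layer_range: [0, 80]`.

```python
import math


def gradient_value(anchors, layer_idx, num_layers):
    """Return the interpolation factor t for one layer.

    `anchors` is a list such as [1, 0.75, 0.5, 0.25, 0]; the first value
    applies to the first layer, the last value to the last layer, and
    layers in between get a linearly interpolated value (assumed scheme).
    """
    if num_layers < 1 or not 0 <= layer_idx < num_layers:
        raise ValueError("layer_idx must be in [0, num_layers)")
    if len(anchors) == 1 or num_layers == 1:
        return float(anchors[0])

    # Map the layer index onto the [0, len(anchors) - 1] anchor axis.
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[hi]


mlp_anchors = [1, 0.75, 0.5, 0.25, 0]
# Per-layer factors for an 80-layer range under the assumed scheme.
mlp_t = [gradient_value(mlp_anchors, i, 80) for i in range(80)]
```

Under this reading `mlp_t` starts at 1.0 for the first layer and falls to 0.0 for the last, while the scalar fallback `value: 0.5` applies the same factor to every tensor not matched by a filter.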
|