Crystalcareai
committed on
Commit
•
2ab3b14
1
Parent(s):
817b0d6
Update README.md
README.md
CHANGED
@@ -10,7 +10,7 @@ datasets:
 
 I'm excited to share an early release of a project that has kept me busy for the last couple of weeks. Mixtral's release propelled me into a deep dive into MoEs. This led to my first experiments with post-training, starting with fine tuning using monsterapi around the middle of December, and later transitioning to axolotl as I got more comfortable with command lines and terminals.
 
-With the release of Qwen1.5, I was curious to see how it would compare to Mixtral. Thanks to
+With the release of Qwen1.5, I was curious to see how it would compare to Mixtral. Thanks to lazymergekit, which simplifies the process for newcomers, I was able to give Qwen1.5-7B a unique twist.
 
 Coming from a background as an acting teacher and coach, I saw parallels between high-quality scripts' impact on performances and the importance of curating high-quality data for training models. This led me to explore data curation, especially for training Mixture of Experts (MoE) models. I looked into Teknium's OpenHermes dataset, Jon Durbin's collections on GitHub, and Eric Hartford's methods for achieving specific outcomes with models.
 