Update README.md

tags:
  - trl
---

# Aether-12b

Aether-12b is a fine-tuned large language model based on Arcanum-12b, further trained on the CleverBoi-Data-20k dataset.

## Model Details 📝
- Developed by: AIXON Lab
- Model type: Causal Language Model
- Language(s): English (primarily); may support other languages
- License: apache-2.0
- Repository: https://huggingface.co/aixonlab/Aether-12b

## Model Architecture 🏗️
- Base model: Xclbr7/Arcanum-12b
- Parameter count: ~12 billion
- Architecture specifics: Transformer-based language model

## Open LLM Leaderboard Evaluation Results
Coming soon!

## Training & Fine-tuning 📚
Aether-12b was fine-tuned on the following dataset:
- Dataset: theprint/CleverBoi-Data-20k
- Fine-tuning method: TRL SFTTrainer with the AdamW optimizer, a cosine-decay learning-rate scheduler, and bfloat16 precision (see the sketch below).
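
For illustration, here is a minimal sketch of what such an SFT run could look like with TRL. Only the optimizer, LR schedule, and precision follow the card; the learning rate, batch size, epoch count, and the `dataset_text_field` column name are placeholder assumptions, since the actual training script and hyperparameters have not been published.

```python
# Hypothetical reconstruction of the fine-tuning setup described above.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

# Base model per the card.
model = AutoModelForCausalLM.from_pretrained(
    "Xclbr7/Arcanum-12b",
    torch_dtype=torch.bfloat16,
)

dataset = load_dataset("theprint/CleverBoi-Data-20k", split="train")

config = SFTConfig(
    output_dir="aether-12b",
    dataset_text_field="text",      # assumed column name; check the dataset schema
    optim="adamw_torch",            # AdamW, as stated on the card
    lr_scheduler_type="cosine",     # cosine decay, as stated on the card
    bf16=True,                      # bfloat16 precision, as stated on the card
    learning_rate=2e-5,             # placeholder assumption
    num_train_epochs=1,             # placeholder assumption
    per_device_train_batch_size=1,  # placeholder assumption
    gradient_accumulation_steps=8,  # placeholder assumption
)

trainer = SFTTrainer(model=model, args=config, train_dataset=dataset)
trainer.train()
```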

The CleverBoi-Data-20k dataset improved the model in the following ways:
1. Enhanced reasoning and problem-solving capabilities
2. Broader knowledge across various topics
3. Improved performance on specific tasks like writing, analysis, and problem-solving
4. Better contextual understanding and response generation

## Intended Use 🎯
For use as a general assistant or a specific role bot.
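
As a quick start, the snippet below shows one plausible way to chat with the model via `transformers`. It assumes the repository ships a tokenizer with a chat template, and the prompt is only an example.

```python
# Minimal inference sketch (assumes a GPU and an installed `accelerate`;
# chat-template support depends on the tokenizer shipped with the repo).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "aixonlab/Aether-12b"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain bfloat16 in one paragraph."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```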

## Ethical Considerations 🤔
As a fine-tune of Arcanum-12b, this model may inherit biases and limitations from its parent model and from the fine-tuning dataset. Users should be aware of potential biases in generated content and use the model responsibly.

## Acknowledgments 🙏
We acknowledge the contributions of:
- theprint for the amazing CleverBoi-Data-20k dataset