Update README.md

tags:
  - trl
---

# Aether-12b

Aether-12b is a fine-tuned large language model based on Arcanum-12b, further trained on the CleverBoi-Data-20k dataset.

## Model Details 📝
- Developed by: AIXON Lab
- Model type: Causal Language Model
- Language(s): English (primarily); may support other languages
- License: apache-2.0
- Repository: https://huggingface.co/aixonlab/Aether-12b

## Model Architecture 🏗️
- Base model: Xclbr7/Arcanum-12b
- Parameter count: ~12 billion
- Architecture specifics: Transformer-based language model

## Open LLM Leaderboard Evaluation Results
Coming soon!

## Training & Fine-tuning 📚
Aether-12b was fine-tuned on the following dataset:
- Dataset: theprint/CleverBoi-Data-20k
- Fine-tuning method: TRL SFTTrainer with the AdamW optimizer, a cosine-decay learning-rate scheduler, and bfloat16 precision (see the sketch below).
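
For illustration, here is a minimal sketch of what such an SFT run could look like with TRL. Only the optimizer, LR schedule, and precision follow the card; the learning rate, batch size, epoch count, and the `dataset_text_field` column name are placeholder assumptions, since the actual training script and hyperparameters have not been published.

```python
# Hypothetical reconstruction of the fine-tuning setup described above.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

# Base model per the card.
model = AutoModelForCausalLM.from_pretrained(
    "Xclbr7/Arcanum-12b",
    torch_dtype=torch.bfloat16,
)

dataset = load_dataset("theprint/CleverBoi-Data-20k", split="train")

config = SFTConfig(
    output_dir="aether-12b",
    dataset_text_field="text",      # assumed column name; check the dataset schema
    optim="adamw_torch",            # AdamW, as stated on the card
    lr_scheduler_type="cosine",     # cosine decay, as stated on the card
    bf16=True,                      # bfloat16 precision, as stated on the card
    learning_rate=2e-5,             # placeholder assumption
    num_train_epochs=1,             # placeholder assumption
    per_device_train_batch_size=1,  # placeholder assumption
    gradient_accumulation_steps=8,  # placeholder assumption
)

trainer = SFTTrainer(model=model, args=config, train_dataset=dataset)
trainer.train()
```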

The CleverBoi-Data-20k dataset improved the model in the following ways:
1. Enhanced reasoning and problem-solving capabilities
2. Broader knowledge across various topics
3. Improved performance on specific tasks like writing, analysis, and problem-solving
4. Better contextual understanding and response generation

## Intended Use 🎯
For use as a general assistant or a specific role bot.
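
As a quick start, the snippet below shows one plausible way to chat with the model via `transformers`. It assumes the repository ships a tokenizer with a chat template, and the prompt is only an example.

```python
# Minimal inference sketch (assumes a GPU and an installed `accelerate`;
# chat-template support depends on the tokenizer shipped with the repo).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "aixonlab/Aether-12b"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain bfloat16 in one paragraph."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```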

## Ethical Considerations 🤔
As a fine-tune of Arcanum-12b, this model may inherit biases and limitations from its parent model and from the fine-tuning dataset. Users should be aware of potential biases in generated content and use the model responsibly.

## Acknowledgments 🙏
We acknowledge the contributions of:
- theprint for the amazing CleverBoi-Data-20k dataset