---
base_model: Xclbr7/Arcanum-12b
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
---

<img src="https://cdn-uploads.huggingface.co/production/uploads/66dcee3321f901b049f48002/Fpdr8qCx9Xx4RHWgptCGD.png" width="800"/>

# Aether-12b

Aether-12b is a fine-tuned large language model based on Arcanum-12b, further trained on the CleverBoi-Data-20k dataset.

## Model Details

- Developed by: AIXON Lab
- Model type: Causal Language Model
- Language(s): English (primarily); may support other languages
- License: apache-2.0
- Repository: https://huggingface.co/aixonlab/Aether-12b
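
For quick reference, a minimal text-generation sketch using the Transformers library. The repository ID comes from the Model Details above; the dtype and generation settings are illustrative choices rather than official recommendations.

```python
# Minimal usage sketch. Assumes the transformers, torch and accelerate packages
# are installed; generation parameters are illustrative, not tuned values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aixonlab/Aether-12b"  # repository listed under Model Details

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the bfloat16 precision used for fine-tuning
    device_map="auto",           # requires the accelerate package
)

prompt = "Explain the difference between supervised and unsupervised learning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```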

## Model Architecture

- Base model: Arcanum-12b
- Parameter count: ~12 billion
- Architecture specifics: Decoder-only transformer language model (Mistral family)
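
To verify these figures locally, the configuration and parameter count can be inspected with Transformers; a short sketch follows (standard config attribute names are assumed, and counting parameters requires downloading the full checkpoint).

```python
# Sketch: inspect the model configuration and count parameters.
# Standard transformers config attributes are assumed; exact values depend on
# the Arcanum-12b base architecture.
import torch
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "aixonlab/Aether-12b"

config = AutoConfig.from_pretrained(model_id)
print(config.model_type, config.hidden_size, config.num_hidden_layers)

# Loads the full ~12B-parameter checkpoint.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
print(f"{model.num_parameters() / 1e9:.1f}B parameters")
```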

## Open LLM Leaderboard Evaluation Results

Coming soon!

## Training & Fine-tuning

Aether-12b was fine-tuned on the following dataset:

- Dataset: theprint/CleverBoi-Data-20k
- Fine-tuning method: TRL SFTTrainer with the AdamW optimizer, a cosine-decay learning-rate scheduler, and bfloat16 precision (see the sketch below).
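
For concreteness, a sketch of that setup with TRL's SFTTrainer is shown below. The optimizer, scheduler, precision, dataset, and base model come from this card; the learning rate, batch size, epoch count, and text-column name are illustrative placeholders rather than the values actually used, and the exact argument set varies across TRL versions.

```python
# Sketch of the fine-tuning recipe described above (TRL SFTTrainer, AdamW,
# cosine LR decay, bfloat16). Hyperparameter values and the dataset column
# name are placeholders, not the actual training configuration.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("theprint/CleverBoi-Data-20k", split="train")

training_args = SFTConfig(
    output_dir="aether-12b-sft",
    optim="adamw_torch",            # AdamW optimizer
    lr_scheduler_type="cosine",     # cosine decay LR scheduler
    bf16=True,                      # bfloat16 precision
    learning_rate=2e-5,             # placeholder
    per_device_train_batch_size=2,  # placeholder
    num_train_epochs=1,             # placeholder
    dataset_text_field="text",      # adjust to the dataset's actual schema
)

trainer = SFTTrainer(
    model="Xclbr7/Arcanum-12b",     # base model being fine-tuned
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```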

The CleverBoi-Data-20k dataset improved the model in the following ways:

1. Enhanced reasoning and problem-solving capabilities
2. Broader knowledge across various topics
3. Improved performance on specific tasks like writing, analysis, and problem-solving
4. Better contextual understanding and response generation

## Intended Use

Aether-12b is intended to be used as a general-purpose assistant or as a role-specific chatbot.
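
As an illustration, a role-specific setup can be sketched with a system message and the tokenizer's chat template. This assumes the tokenizer ships a chat template that accepts a system role (not confirmed in this card); the prompts and generation settings are illustrative.

```python
# Sketch: using the model as a role-specific assistant via the chat template.
# If the bundled template rejects a "system" role, fold the instruction into
# the first user message instead.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aixonlab/Aether-12b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a meticulous technical writing assistant."},
    {"role": "user", "content": "Summarize the trade-offs of bfloat16 training in two sentences."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```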

## Ethical Considerations

As a fine-tuned model based on Arcanum-12b, this model may inherit biases and limitations from its parent model and the fine-tuning dataset. Users should be aware of potential biases in generated content and use the model responsibly.

## Acknowledgments

We acknowledge the contributions of:

- theprint for the amazing CleverBoi-Data-20k dataset