anakin87/Phi-3.5-mini-ITA

@@ -23,7 +23,11 @@ Fine-tuned version of [Microsoft/Phi-3.5-mini-instruct](https://huggingface.co/m
 - Small yet powerful model with 3.82 billion parameters
 - Supports 128k context length
-[💬🇮🇹 Chat with the model on Hugging Face Spaces](https://huggingface.co/spaces/anakin87/Phi-3.5-mini-ITA)
+- [💬🇮🇹 Chat with the model on Hugging Face Spaces](https://huggingface.co/spaces/anakin87/Phi-3.5-mini-ITA)
+- [GGUF quants](https://huggingface.co/QuantFactory/Phi-3.5-mini-ITA-GGUF)
+🏋️‍♂️ **Do you want to understand how the model was trained?**
+Check out the [📖 full walkthrough article](https://huggingface.co/blog/anakin87/spectrum) and the accompanying [💻 notebook](./notebooks/training.ipynb)
 ## 🏆 Evaluation
@@ -112,6 +116,6 @@ It underwent 2 epochs of instruction fine-tuning on the [FineTome-100k](https://
 I adopted a relatively new technique for parameter-efficient learning: [Spectrum](https://arxiv.org/abs/2406.06623).
 The idea is to train only the layers of the model with high Signal-to-Noise Ratio (SNR) and ❄️ freeze the rest.
-Training required about 14 hours on a single A40 GPU.
-I may release a guide/tutorial soon. Stay tuned! 📻
+Training required about 14 hours on a single A6000 GPU.
+**For complete training details**, check out the [📖 full walkthrough article](https://huggingface.co/blog/anakin87/spectrum) and the accompanying [💻 notebook](./notebooks/training.ipynb).
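
The README change describes Spectrum as training only the layers with high SNR and freezing the rest. The selection step might be sketched as below. This is a simplified illustration, not the Spectrum implementation or the code in the linked notebook: the layer names, the SNR values, and the helpers `select_trainable_layers` and `freeze_plan` are all hypothetical, and the sketch ranks all layers together by a single score, which may differ from how Spectrum groups and scores modules.

```python
import math


def select_trainable_layers(snr_by_layer, top_fraction=0.25):
    """Return the names of the layers whose SNR ranks in the top `top_fraction`.

    `snr_by_layer` maps a layer name to a precomputed SNR score
    (hypothetical input; computing the SNR itself is out of scope here).
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    # Rank layer names from highest to lowest SNR.
    ranked = sorted(snr_by_layer, key=snr_by_layer.get, reverse=True)
    # Always keep at least one layer trainable.
    keep = max(1, math.ceil(len(ranked) * top_fraction))
    return set(ranked[:keep])


def freeze_plan(snr_by_layer, top_fraction=0.25):
    """Map every layer name to True (train) or False (freeze)."""
    trainable = select_trainable_layers(snr_by_layer, top_fraction)
    return {name: name in trainable for name in snr_by_layer}


# Made-up SNR scores, for illustration only.
snr_scores = {
    "layers.0.mlp": 0.8,
    "layers.1.mlp": 2.4,
    "layers.2.mlp": 1.1,
    "layers.3.mlp": 3.0,
}

plan = freeze_plan(snr_scores, top_fraction=0.5)
for name, train in plan.items():
    print(f"{name}: {'train' if train else 'freeze'}")
```

With a PyTorch model, a plan like this could be applied by setting `requires_grad` on each parameter according to the layer it belongs to; the mapping from parameter names to layer names is left as an assumption here.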

notebooks/training.ipynb ADDED (diff too large to render)