About the phi-4 training hyperparameters

by zj2023vegetable

Could you provide some details about the pre-training and post-training (SFT & RL)?

Unsloth AI org


You'll need to check our docs, which give some info: https://docs.unsloth.ai/

Generally, the hyperparameters we set in the notebooks are the best.
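
For concreteness, the Unsloth notebooks broadly follow the pattern sketched below. This is a minimal sketch, not the confirmed phi-4 training recipe: the specific values (`r = 16`, `learning_rate = 2e-4`, the batch-size settings, and the tiny stand-in dataset) are illustrative defaults and may differ from any given notebook.

```python
from unsloth import FastLanguageModel
from datasets import Dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the 4-bit quantized base model (illustrative settings).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/phi-4",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach LoRA adapters; only these low-rank matrices are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,                        # LoRA rank (illustrative, not confirmed)
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
)

# Tiny stand-in dataset so the sketch runs end to end; the real notebooks
# format an instruction dataset into a "text" column.
dataset = Dataset.from_dict(
    {"text": ["### Instruction:\nSay hi.\n### Response:\nHi!"]}
)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 2,   # illustrative, not confirmed
        gradient_accumulation_steps = 4,   # illustrative, not confirmed
        warmup_steps = 5,
        max_steps = 60,
        learning_rate = 2e-4,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        logging_steps = 1,
        output_dir = "outputs",
    ),
)
trainer.train()
```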

First of all, thank you for your response! I noticed that in your blog (https://unsloth.ai/blog/phi4) you mentioned fixing a fine-tuning bug, so it seems you have conducted comprehensive SFT post-training on phi-4. The Colab doesn't say how many GPUs were used during SFT, so it's impossible to infer the global batch size or the number of tokens used in one update step. Additionally, I noticed that the Colab code uses LoRA. Was the currently released phi-4 also fine-tuned using LoRA? Could you kindly share these details if convenient?
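
For reference, the global batch size in question follows from the trainer settings; the values below are assumed single-GPU Colab notebook defaults, not confirmed phi-4 training settings.

```python
# Assumed single-GPU Colab defaults -- illustrative only,
# NOT confirmed hyperparameters for the released phi-4 fine-tune.
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
num_gpus = 1                 # a Colab runtime provides a single GPU
max_seq_length = 2048

# Sequences per optimizer update:
global_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus
assert global_batch_size == 8

# Upper bound on tokens per update step (if every sequence is packed or
# padded to the full context length):
max_tokens_per_step = global_batch_size * max_seq_length
assert max_tokens_per_step == 16_384
```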
