LoRA

This model (jerseyjerry/task-5-microsoft-Phi-3-mini-4k-instruct-20250301) is a LoRA adapter fine-tuned from microsoft/Phi-3-mini-4k-instruct on the flock_task5_tranning dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0052

Model description

More information needed
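
No usage example is provided in the card. The following is a minimal, hedged sketch of how this adapter could be loaded for inference with PEFT and Transformers; the prompt and generation settings are illustrative assumptions, and only the base model and adapter repository ids come from this card.

```python
# Minimal loading sketch (not part of the original card). Assumes the pinned
# framework versions listed below, e.g.:
#   pip install peft==0.12.0 transformers==4.48.3 torch==2.6.0
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "microsoft/Phi-3-mini-4k-instruct"
adapter_id = "jerseyjerry/task-5-microsoft-Phi-3-mini-4k-instruct-20250301"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the LoRA adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Illustrative prompt; the task format used for training is not documented here.
messages = [{"role": "user", "content": "Hello, what can you do?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```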

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • total_eval_batch_size: 2
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08, no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 60
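
For reference, here is a hedged sketch of how the hyperparameters above map onto a Transformers/PEFT training configuration. Only the values listed in the card are taken from the source; the LoRA rank, alpha, target modules, precision, and Trainer/data wiring are assumptions for illustration.

```python
# Sketch of the listed hyperparameters as a Transformers/PEFT configuration.
# Anything not listed in the card (LoRA rank/alpha, target modules, precision,
# data handling) is an assumption for illustration only.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

training_args = TrainingArguments(
    output_dir="lora",
    learning_rate=5e-5,
    per_device_train_batch_size=2,   # train_batch_size: 2
    per_device_eval_batch_size=1,    # eval_batch_size: 1
    seed=42,
    gradient_accumulation_steps=2,   # with 2 GPUs -> total train batch size 8
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=60,
    bf16=True,                       # assumption; precision is not stated in the card
)

# Hypothetical adapter settings; the actual LoRA config is not documented here.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["qkv_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
model = get_peft_model(base, lora_config)
# The model would then be passed to a Trainer/SFTTrainer together with the
# flock_task5_tranning dataset (data preparation omitted).
```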

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.5231        | 2.5   | 10   | 1.6286          |
| 1.4069        | 5.0   | 20   | 1.4773          |
| 1.2518        | 7.5   | 30   | 1.3657          |
| 1.3069        | 10.0  | 40   | 1.2441          |
| 1.0816        | 12.5  | 50   | 1.0924          |
| 1.0063        | 15.0  | 60   | 0.9201          |
| 0.666         | 17.5  | 70   | 0.7236          |
| 0.5723        | 20.0  | 80   | 0.5105          |
| 0.3671        | 22.5  | 90   | 0.3136          |
| 0.2108        | 25.0  | 100  | 0.1737          |
| 0.1203        | 27.5  | 110  | 0.0830          |
| 0.069         | 30.0  | 120  | 0.0397          |
| 0.0233        | 32.5  | 130  | 0.0212          |
| 0.0158        | 35.0  | 140  | 0.0129          |
| 0.0104        | 37.5  | 150  | 0.0093          |
| 0.0081        | 40.0  | 160  | 0.0076          |
| 0.0073        | 42.5  | 170  | 0.0066          |
| 0.0072        | 45.0  | 180  | 0.0060          |
| 0.0062        | 47.5  | 190  | 0.0056          |
| 0.0063        | 50.0  | 200  | 0.0054          |
| 0.0068        | 52.5  | 210  | 0.0053          |
| 0.0064        | 55.0  | 220  | 0.0052          |
| 0.0061        | 57.5  | 230  | 0.0052          |
| 0.0056        | 60.0  | 240  | 0.0052          |

Framework versions

  • PEFT 0.12.0
  • Transformers 4.48.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0