---
license: apache-2.0
base_model: distilbert/distilgpt2
tags:
  - generated_from_trainer
model-index:
  - name: result_llm
    results: []
---

# result_llm

This model is a fine-tuned version of [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: nan

Note that a NaN evaluation loss typically signals numerical instability during training or evaluation; treat this checkpoint with caution until the cause is identified.

## Model description

More information needed

## Intended uses & limitations

More information needed
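
Until the author fills this section in, the most likely use is causal text generation. A minimal loading sketch follows; the Hub repo id `ukzash1/result_llm` is inferred from this card and may differ from where the weights are actually published, and given the NaN validation loss reported above, generations may be degenerate.

```python
# Minimal sketch: load the fine-tuned checkpoint for text generation.
# Assumption: the model lives at "ukzash1/result_llm"; swap in the real
# repo id or a local checkpoint path if it differs.
from transformers import pipeline

generator = pipeline("text-generation", model="ukzash1/result_llm")
print(generator("Once upon a time", max_new_tokens=40)[0]["generated_text"])
```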

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
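
The sketch below shows one way these values map onto `transformers.TrainingArguments`. It is an illustration, not the author's actual script: the `output_dir`, the model loading, and the commented-out `Trainer` wiring (including the dataset) are assumptions, since the card does not specify them.

```python
# Sketch only: maps the listed hyperparameters onto TrainingArguments.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")
tokenizer = AutoTokenizer.from_pretrained("distilbert/distilgpt2")

args = TrainingArguments(
    output_dir="result_llm",           # assumed output directory
    learning_rate=5e-05,
    per_device_train_batch_size=4,     # train_batch_size
    per_device_eval_batch_size=8,      # eval_batch_size
    seed=42,
    adam_beta1=0.9,                    # optimizer: Adam betas
    adam_beta2=0.999,
    adam_epsilon=1e-08,                # optimizer: Adam epsilon
    lr_scheduler_type="linear",
    num_train_epochs=3,
)

# The training corpus is unknown ("an unknown dataset"), so the Trainer call
# is left as a placeholder:
# trainer = Trainer(model=model, args=args, train_dataset=..., eval_dataset=...)
# trainer.train()
```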

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 8.289         | 0.0554 | 500   | nan             |
| 6.8357        | 0.1109 | 1000  | nan             |
| 6.7413        | 0.1663 | 1500  | nan             |
| 6.6101        | 0.2218 | 2000  | nan             |
| 6.6348        | 0.2772 | 2500  | nan             |
| 6.6871        | 0.3326 | 3000  | nan             |
| 6.602         | 0.3881 | 3500  | nan             |
| 6.6078        | 0.4435 | 4000  | nan             |
| 6.5465        | 0.4989 | 4500  | nan             |
| 6.5643        | 0.5544 | 5000  | nan             |
| 6.5696        | 0.6098 | 5500  | nan             |
| 6.5294        | 0.6653 | 6000  | nan             |
| 6.5638        | 0.7207 | 6500  | nan             |
| 6.4361        | 0.7761 | 7000  | nan             |
| 6.4547        | 0.8316 | 7500  | nan             |
| 6.5327        | 0.8870 | 8000  | nan             |
| 6.3524        | 0.9425 | 8500  | nan             |
| 6.4341        | 0.9979 | 9000  | nan             |
| 6.3677        | 1.0533 | 9500  | nan             |
| 6.199         | 1.1088 | 10000 | nan             |
| 6.3033        | 1.1642 | 10500 | nan             |
| 6.2976        | 1.2196 | 11000 | nan             |
| 6.2322        | 1.2751 | 11500 | nan             |
| 6.2222        | 1.3305 | 12000 | nan             |
| 6.2119        | 1.3860 | 12500 | nan             |
| 6.2336        | 1.4414 | 13000 | nan             |
| 6.349         | 1.4968 | 13500 | nan             |
| 6.311         | 1.5523 | 14000 | nan             |
| 6.2247        | 1.6077 | 14500 | nan             |
| 6.2851        | 1.6632 | 15000 | nan             |
| 6.35          | 1.7186 | 15500 | nan             |
| 6.2996        | 1.7740 | 16000 | nan             |
| 6.3229        | 1.8295 | 16500 | nan             |
| 6.3609        | 1.8849 | 17000 | nan             |
| 6.3063        | 1.9403 | 17500 | nan             |
| 6.2759        | 1.9958 | 18000 | nan             |
| 6.2499        | 2.0512 | 18500 | nan             |
| 6.1473        | 2.1067 | 19000 | nan             |
| 6.2088        | 2.1621 | 19500 | nan             |
| 6.2482        | 2.2175 | 20000 | nan             |
| 6.2123        | 2.2730 | 20500 | nan             |
| 6.2298        | 2.3284 | 21000 | nan             |
| 6.2666        | 2.3839 | 21500 | nan             |
| 6.21          | 2.4393 | 22000 | nan             |
| 6.2396        | 2.4947 | 22500 | nan             |
| 6.2626        | 2.5502 | 23000 | nan             |
| 6.1824        | 2.6056 | 23500 | nan             |
| 6.3142        | 2.6610 | 24000 | nan             |
| 6.2816        | 2.7165 | 24500 | nan             |
| 6.2371        | 2.7719 | 25000 | nan             |
| 6.3075        | 2.8274 | 25500 | nan             |
| 6.2306        | 2.8828 | 26000 | nan             |
| 6.2919        | 2.9382 | 26500 | nan             |
| 6.2668        | 2.9937 | 27000 | nan             |

### Framework versions

- Transformers 4.42.4
- Pytorch 2.4.0+cu121
- Tokenizers 0.19.1