metadata
license: mit
tags:
  - generated_from_trainer
base_model: EleutherAI/gpt-neo-125m
model-index:
  - name: gpt-neo-125m-finetuned-philosopher_rave_20
    results: []

gpt-neo-125m-finetuned-philosopher_rave_20

This model is a fine-tuned version of EleutherAI/gpt-neo-125m on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7097
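A loss figure is easier to interpret as perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model fine-tuning but is not stated in this card.

```python
import math

# Final validation loss reported in this card.
eval_loss = 2.7097

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity = {perplexity:.2f}")
```

Under that assumption the evaluation perplexity comes out at roughly 15.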

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

The training and evaluation datasets are not documented. Training ran for 20 epochs.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-07
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
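To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (none are listed above) and takes the total of 3100 optimizer steps from the final row of the training results; `linear_lr` is a hypothetical helper written for illustration, not a library function.

```python
BASE_LR = 3e-07      # learning_rate from the list above
TOTAL_STEPS = 3100   # final step in the training results table
WARMUP_STEPS = 0     # assumption: the card lists no warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 155 steps per epoch (3100 steps / 20 epochs).
for epoch in (0, 5, 10, 15, 20):
    step = epoch * 155
    print(f"epoch {epoch:2d}  step {step:4d}  lr {linear_lr(step):.3e}")
```

With these settings the rate starts at 3e-07, reaches half that value at the midpoint of training, and is zero at the last step, so the later epochs take progressively smaller updates.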

Training results

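| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| No log        | 1.0   | 155  | 2.8834          |
| No log        | 2.0   | 310  | 2.8606          |
| No log        | 3.0   | 465  | 2.8407          |
| 2.8695        | 4.0   | 620  | 2.8228          |
| 2.8695        | 5.0   | 775  | 2.8063          |
| 2.8695        | 6.0   | 930  | 2.7911          |
| 2.8122        | 7.0   | 1085 | 2.7772          |
| 2.8122        | 8.0   | 1240 | 2.7650          |
| 2.8122        | 9.0   | 1395 | 2.7544          |
| 2.7613        | 10.0  | 1550 | 2.7454          |
| 2.7613        | 11.0  | 1705 | 2.7378          |
| 2.7613        | 12.0  | 1860 | 2.7313          |
| 2.7397        | 13.0  | 2015 | 2.7258          |
| 2.7397        | 14.0  | 2170 | 2.7211          |
| 2.7397        | 15.0  | 2325 | 2.7173          |
| 2.7397        | 16.0  | 2480 | 2.7143          |
| 2.7214        | 17.0  | 2635 | 2.7121          |
| 2.7214        | 18.0  | 2790 | 2.7106          |
| 2.7214        | 19.0  | 2945 | 2.7098          |
| 2.7104        | 20.0  | 3100 | 2.7097          |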
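The validation losses can be summarized with a few lines of arithmetic. The sketch below computes the per-epoch improvement and a rough bound on the training set size. The size estimate is an assumption: it presumes a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped, none of which this card confirms.

```python
# Validation loss at the end of each epoch, copied from the results above.
val_loss = [
    2.8834, 2.8606, 2.8407, 2.8228, 2.8063,
    2.7911, 2.7772, 2.7650, 2.7544, 2.7454,
    2.7378, 2.7313, 2.7258, 2.7211, 2.7173,
    2.7143, 2.7121, 2.7106, 2.7098, 2.7097,
]

TOTAL_STEPS = 3100
NUM_EPOCHS = 20
TRAIN_BATCH_SIZE = 8

# Improvement in validation loss from one epoch to the next.
deltas = [prev - cur for prev, cur in zip(val_loss, val_loss[1:])]
total_drop = val_loss[0] - val_loss[-1]

# Steps per epoch and the implied range of training examples
# (assumption: single device, no gradient accumulation, last batch kept).
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1

print(f"total drop over {NUM_EPOCHS} epochs: {total_drop:.4f}")
print(f"first-epoch gain: {deltas[0]:.4f}, last-epoch gain: {deltas[-1]:.4f}")
print(f"steps per epoch: {steps_per_epoch}")
print(f"implied training examples: {min_examples} to {max_examples}")
```

The validation loss falls every epoch, but the per-epoch gain shrinks from about 0.023 to about 0.0001, which fits a learning rate that decays linearly to zero. The training loss column only changes every few rows, which is consistent with the loss being logged every 500 steps rather than every epoch.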

Framework versions

  • Transformers 4.39.3
  • PyTorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2