
gpt-neo-125m-finetuned-philosopher_rave_20

This model is a fine-tuned version of EleutherAI/gpt-neo-125m on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7097
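For readers who prefer perplexity, the loss can be converted with a one-line calculation. This is a minimal sketch that assumes the reported loss is the mean per-token cross-entropy in nats, which is typical of a causal language modeling run but is not stated in this card. Under that assumption the result is roughly 15.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 2.7097

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```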

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

The training and evaluation data are not documented. Training ran for 20 epochs.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-07
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
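To make the `linear` scheduler setting concrete, here is a small sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card lists no warmup setting) and takes the total of 3100 optimizer steps from the last row of the training results. The function name `linear_lr` is illustrative and is not part of any library.

```python
BASE_LR = 3e-07      # learning_rate from the hyperparameter list
TOTAL_STEPS = 3100   # final step reported in the training results
WARMUP_STEPS = 0     # assumption: no warmup is listed in this card


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < WARMUP_STEPS:
        # Linear ramp-up; unused while WARMUP_STEPS is 0.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Print the rate at the start, midpoint, and end of training.
for step in (0, 1550, 3100):
    print(step, linear_lr(step))
```

Under these assumptions the rate is halved by the end of epoch 10 and reaches zero at step 3100.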

Training results

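| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| No log        | 1.0   | 155  | 2.8834          |
| No log        | 2.0   | 310  | 2.8606          |
| No log        | 3.0   | 465  | 2.8407          |
| 2.8695        | 4.0   | 620  | 2.8228          |
| 2.8695        | 5.0   | 775  | 2.8063          |
| 2.8695        | 6.0   | 930  | 2.7911          |
| 2.8122        | 7.0   | 1085 | 2.7772          |
| 2.8122        | 8.0   | 1240 | 2.7650          |
| 2.8122        | 9.0   | 1395 | 2.7544          |
| 2.7613        | 10.0  | 1550 | 2.7454          |
| 2.7613        | 11.0  | 1705 | 2.7378          |
| 2.7613        | 12.0  | 1860 | 2.7313          |
| 2.7397        | 13.0  | 2015 | 2.7258          |
| 2.7397        | 14.0  | 2170 | 2.7211          |
| 2.7397        | 15.0  | 2325 | 2.7173          |
| 2.7397        | 16.0  | 2480 | 2.7143          |
| 2.7214        | 17.0  | 2635 | 2.7121          |
| 2.7214        | 18.0  | 2790 | 2.7106          |
| 2.7214        | 19.0  | 2945 | 2.7098          |
| 2.7104        | 20.0  | 3100 | 2.7097          |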
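Since the dataset is undocumented, the step counts give a rough hint of its size. The sketch below assumes a single device, no gradient accumulation, and that a final partial batch is kept. None of these is stated in the card, so treat the result as an estimate only. It also computes how far the validation loss fell between epoch 1 and epoch 20.

```python
# Values taken from this card.
steps_per_epoch = 155      # step count at the end of epoch 1
train_batch_size = 8
first_val_loss = 2.8834    # validation loss after epoch 1
last_val_loss = 2.7097     # validation loss after epoch 20

# Assumptions: one device, no gradient accumulation, last partial batch kept.
max_examples = steps_per_epoch * train_batch_size
min_examples = (steps_per_epoch - 1) * train_batch_size + 1

# Absolute and relative improvement in validation loss over the run.
loss_drop = first_val_loss - last_val_loss
relative_drop = loss_drop / first_val_loss

print(f"estimated training examples: {min_examples} to {max_examples}")
print(f"validation loss drop: {loss_drop:.4f} ({relative_drop:.1%})")
```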

Framework versions

  • Transformers 4.39.3
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
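
With the libraries above installed, the checkpoint can presumably be loaded through the standard Transformers text-generation pipeline. The sketch below is an untested illustration: the repository id is taken from the model page, while the helper name `generate` and the sampling settings are assumptions chosen for the example, not recommendations from the model author.

```python
# Repository id as shown on the model page.
MODEL_ID = "Triangles/gpt-neo-125m-finetuned-philosopher_rave_20"


def generate(prompt: str, max_new_tokens: int = 40) -> str:
    """Return one sampled continuation of `prompt` (illustrative helper)."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed or the model to be downloaded.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    return outputs[0]["generated_text"]
```

Calling `generate("The nature of truth is")` downloads the weights on first use and returns the prompt followed by the sampled continuation.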