saaz

This model is a fine-tuned version of distilgpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2110
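
The card does not include a usage example, so here is a minimal sketch of loading this checkpoint for text generation with the Transformers `pipeline` API; the prompt is a placeholder, since the training data is undocumented.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint for causal text generation.
generator = pipeline("text-generation", model="samhitmantrala/saaz")

# Placeholder prompt; substitute input resembling the (undocumented) training data.
result = generator("Once upon a time", max_new_tokens=50, do_sample=True)
print(result[0]["generated_text"])
```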

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch wiring them into the Trainer API follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
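
As a rough sketch, the list above maps onto a `TrainingArguments` configuration like the one below. The output directory is a placeholder, the datasets are not documented in this card, and the Adam betas/epsilon and linear schedule match the Trainer defaults.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("distilgpt2")
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")

args = TrainingArguments(
    output_dir="saaz",                 # placeholder path, not from the card
    learning_rate=2e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    evaluation_strategy="epoch",       # the card reports validation loss per epoch
)

# The train/eval datasets are not documented in this card; supply your own:
# trainer = Trainer(model=model, args=args,
#                   train_dataset=your_train_dataset,
#                   eval_dataset=your_eval_dataset,
#                   tokenizer=tokenizer)
# trainer.train()
```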

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 1    | 0.9750          |
| No log        | 2.0   | 2    | 0.6813          |
| No log        | 3.0   | 3    | 0.4624          |
| No log        | 4.0   | 4    | 0.3387          |
| No log        | 5.0   | 5    | 0.2877          |
| No log        | 6.0   | 6    | 0.2399          |
| No log        | 7.0   | 7    | 0.2235          |
| No log        | 8.0   | 8    | 0.2259          |
| No log        | 9.0   | 9    | 0.2267          |
| No log        | 10.0  | 10   | 0.2259          |
| No log        | 11.0  | 11   | 0.2259          |
| No log        | 12.0  | 12   | 0.2173          |
| No log        | 13.0  | 13   | 0.2126          |
| No log        | 14.0  | 14   | 0.2077          |
| No log        | 15.0  | 15   | 0.2066          |
| No log        | 16.0  | 16   | 0.2067          |
| No log        | 17.0  | 17   | 0.2089          |
| No log        | 18.0  | 18   | 0.2103          |
| No log        | 19.0  | 19   | 0.2108          |
| No log        | 20.0  | 20   | 0.2110          |
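
Because the loss of a causal language model is the mean per-token cross-entropy, the final validation loss converts directly to perplexity; a quick check:

```python
import math

# Perplexity = exp(mean cross-entropy loss) for a causal LM.
final_val_loss = 0.2110
print(f"perplexity ≈ {math.exp(final_val_loss):.3f}")  # ≈ 1.235
```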

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
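
To reproduce the original environment, the installed versions can be checked against those listed above; a minimal sketch:

```python
# Sanity-check installed versions against those listed in this card.
import transformers, torch, datasets, tokenizers

for module, listed in [(transformers, "4.38.2"), (torch, "2.2.1+cu121"),
                       (datasets, "2.18.0"), (tokenizers, "0.15.2")]:
    print(f"{module.__name__}: installed {module.__version__}, card lists {listed}")
```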

Model size

  • 81.9M parameters (F32 tensors, stored as Safetensors)