long_t5

This model is a fine-tuned version of google/long-t5-tglobal-base. The fine-tuning dataset is not recorded in this card. At the end of training (epoch 20, step 32000) it achieves the following results on the evaluation set:

  • Loss: 1.5158
  • Rouge1: 0.5214
  • Rouge2: 0.3347
  • Rougel: 0.4751
  • Rougelsum: 0.4746
  • Gen Len: 25.9513
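
For readers unfamiliar with the metrics above, Rouge1 and Rouge2 measure unigram and bigram overlap between a generated text and its reference. The card does not say which ROUGE implementation produced these numbers, so the following is only a simplified sketch of the F1 variant, assuming lowercase whitespace tokenization and no stemming; the function names are illustrative and scores from it will not exactly match a full ROUGE package.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, prediction, n=1):
    """Simplified ROUGE-N F1: clipped n-gram overlap between two strings.

    Assumption: whitespace tokenization on lowercased text, no stemming.
    """
    ref = ngram_counts(reference.lower().split(), n)
    pred = ngram_counts(prediction.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & pred).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
prediction = "the cat lay on the mat"
rouge1 = rouge_n_f1(reference, prediction, n=1)
rouge2 = rouge_n_f1(reference, prediction, n=2)
```

In this toy pair, five of six unigrams and three of five bigrams match, which is why the bigram score is the lower of the two, the same ordering seen in the reported Rouge1 and Rouge2 values.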

Model description

More information needed

Intended uses & limitations

More information needed
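
Until this section is filled in, the ROUGE metrics and the short average generation length suggest a summarization-style text-to-text task. The following is a minimal loading sketch under several assumptions: that the checkpoint is published under the repository id zera09/long_t5, that the repository includes tokenizer files, and that it loads through the standard transformers seq2seq classes. The helper name `summarize` and the generation settings (`max_input_tokens`, `max_new_tokens`, `num_beams`) are illustrative choices, not the author's.

```python
MODEL_ID = "zera09/long_t5"  # assumed repository id for this checkpoint


def summarize(text, max_input_tokens=4096, max_new_tokens=64):
    """Generate a short output for `text` with the fine-tuned LongT5 checkpoint.

    Requires the `transformers` and `torch` packages. They are imported here
    so that defining the function does not need them installed.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long inputs; the 4096-token limit is an illustrative choice.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=4,  # illustrative; the card does not report decoding settings
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("...your long document...")` downloads the weights on first use. If the repository turns out to lack tokenizer files, loading the tokenizer from google/long-t5-tglobal-base is a plausible fallback, since fine-tuning normally leaves the tokenizer unchanged.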

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
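
To make the schedule concrete, the sketch below derives the total step count and the learning rate at a given step from the values listed above. It assumes zero warmup steps, a single device, and no gradient accumulation, none of which the card states. The figure of 1600 steps per epoch is taken from the training results in this card, and the function name `linear_lr` is illustrative.

```python
# Values reported in this card.
LEARNING_RATE = 2e-5
TRAIN_BATCH_SIZE = 4
NUM_EPOCHS = 20
STEPS_PER_EPOCH = 1600  # training results: step 1600 at epoch 1.0

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    Assumption: warmup_steps=0, since the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: one device and no gradient accumulation, so each optimizer
# step consumes exactly TRAIN_BATCH_SIZE examples.
examples_per_epoch = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

lr_start = linear_lr(0)
lr_midpoint = linear_lr(TOTAL_STEPS // 2)
lr_end = linear_lr(TOTAL_STEPS)
```

Under these assumptions the training set holds roughly 6400 examples, and the learning rate falls from 2e-05 at step 0 to half that at step 16000 and to zero at step 32000.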

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---------------|-------|-------|-----------------|--------|--------|--------|-----------|---------|
| 2.232         | 1.0   | 1600  | 1.6810          | 0.4704 | 0.2861 | 0.4256 | 0.4251    | 26.6112 |
| 2.0229        | 2.0   | 3200  | 1.6167          | 0.4859 | 0.2991 | 0.4412 | 0.4407    | 26.1006 |
| 1.9239        | 3.0   | 4800  | 1.5805          | 0.4924 | 0.3049 | 0.4475 | 0.4468    | 26.8169 |
| 1.8454        | 4.0   | 6400  | 1.5669          | 0.4968 | 0.3093 | 0.4517 | 0.4511    | 25.925  |
| 1.7626        | 5.0   | 8000  | 1.5432          | 0.4973 | 0.3132 | 0.453  | 0.4525    | 26.4362 |
| 1.6995        | 6.0   | 9600  | 1.5352          | 0.5045 | 0.3188 | 0.4596 | 0.459     | 26.1219 |
| 1.682         | 7.0   | 11200 | 1.5255          | 0.5066 | 0.3198 | 0.4613 | 0.4609    | 26.1581 |
| 1.6286        | 8.0   | 12800 | 1.5210          | 0.5113 | 0.3245 | 0.4663 | 0.466     | 26.1725 |
| 1.593         | 9.0   | 14400 | 1.5195          | 0.5102 | 0.3235 | 0.464  | 0.4638    | 25.8944 |
| 1.5784        | 10.0  | 16000 | 1.5166          | 0.5133 | 0.3265 | 0.4665 | 0.4661    | 25.685  |
| 1.5615        | 11.0  | 17600 | 1.5135          | 0.5161 | 0.3284 | 0.47   | 0.4695    | 25.8681 |
| 1.5391        | 12.0  | 19200 | 1.5106          | 0.5156 | 0.3303 | 0.4703 | 0.4701    | 26.1781 |
| 1.5077        | 13.0  | 20800 | 1.5095          | 0.5177 | 0.3317 | 0.4724 | 0.4721    | 26.0456 |
| 1.4923        | 14.0  | 22400 | 1.5163          | 0.5185 | 0.3321 | 0.4728 | 0.4723    | 26.17   |
| 1.4545        | 15.0  | 24000 | 1.5128          | 0.5181 | 0.3337 | 0.4727 | 0.4724    | 25.8219 |
| 1.4489        | 16.0  | 25600 | 1.5135          | 0.5209 | 0.3349 | 0.4744 | 0.4743    | 26.0369 |
| 1.4481        | 17.0  | 27200 | 1.5153          | 0.5218 | 0.3349 | 0.4751 | 0.4748    | 26.1744 |
| 1.4287        | 18.0  | 28800 | 1.5134          | 0.521  | 0.335  | 0.4752 | 0.4747    | 25.9525 |
| 1.389         | 19.0  | 30400 | 1.5155          | 0.5212 | 0.3348 | 0.4756 | 0.4751    | 26.0369 |
| 1.4215        | 20.0  | 32000 | 1.5158          | 0.5214 | 0.3347 | 0.4751 | 0.4746    | 25.9513 |
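
The results above show validation loss and ROUGE peaking at different epochs, which matters when choosing a checkpoint. The sketch below copies the epoch, validation loss, and Rouge1 columns from the training results and picks the best epoch by each criterion. The variable names are illustrative, and the card does not say which criterion, if any, was used to select the published weights.

```python
# (epoch, validation_loss, rouge1) copied from the training results.
RESULTS = [
    (1, 1.6810, 0.4704), (2, 1.6167, 0.4859), (3, 1.5805, 0.4924),
    (4, 1.5669, 0.4968), (5, 1.5432, 0.4973), (6, 1.5352, 0.5045),
    (7, 1.5255, 0.5066), (8, 1.5210, 0.5113), (9, 1.5195, 0.5102),
    (10, 1.5166, 0.5133), (11, 1.5135, 0.5161), (12, 1.5106, 0.5156),
    (13, 1.5095, 0.5177), (14, 1.5163, 0.5185), (15, 1.5128, 0.5181),
    (16, 1.5135, 0.5209), (17, 1.5153, 0.5218), (18, 1.5134, 0.5210),
    (19, 1.5155, 0.5212), (20, 1.5158, 0.5214),
]

# Lowest validation loss versus highest Rouge1.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])

print(f"Lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"Highest Rouge1:         epoch {best_by_rouge1[0]} ({best_by_rouge1[2]})")
```

Validation loss is lowest at epoch 13 (1.5095), while Rouge1 peaks at epoch 17 (0.5218). Both differ from the final epoch-20 values by less than 0.01, so the extra epochs change the evaluation metrics very little.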

Framework versions

  • Transformers 4.41.2
  • PyTorch 2.3.1+cu118
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 248M params (Safetensors, tensor type F32)