DaMedSumT5-large / README.md
---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: mt5-large_V8901_V89881_V17542
  results: []
---

# mt5-large_V8901_V89881_V17542

This model is a fine-tuned version of [emilstabil/mt5-large_V8901_V89881](https://huggingface.co/emilstabil/mt5-large_V8901_V89881) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 1.7588
- Rouge1: 27.8059
- Rouge2: 10.1962
- Rougel: 15.3479
- Rougelsum: 26.1829
- Gen Len: 540.9451

## Model description

More information needed

## Intended uses & limitations

More information needed
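Pending a fuller description, a minimal inference sketch (assuming the checkpoint is published on the Hub under the model-index name above; the input text, truncation length, and generation settings are illustrative, not taken from the card):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed Hub repo id -- adjust to the actual location of this checkpoint.
model_id = "emilstabil/mt5-large_V8901_V89881_V17542"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "..."  # document to summarize
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
# The eval set reports Gen Len around 541 tokens, so allow long outputs.
summary_ids = model.generate(**inputs, max_new_tokens=600, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```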

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 11
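These hyperparameters map onto a `Seq2SeqTrainingArguments` configuration roughly as follows (a sketch, not the actual training script; `output_dir` is a placeholder):

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the configuration listed above; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-large_V8901_V89881_V17542",
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=4,  # effective train batch size: 2 * 4 = 8
    lr_scheduler_type="linear",
    num_train_epochs=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```

Note that `total_train_batch_size: 8` is not a separate argument: it is the per-device batch size (2) multiplied by the gradient accumulation steps (4).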

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| 1.871         | 2.11  | 500  | 1.7570          | 27.3094 | 9.9205  | 15.0662 | 25.6006   | 544.1983 |
| 1.7982        | 4.21  | 1000 | 1.7661          | 27.6824 | 9.8242  | 15.2022 | 25.9032   | 537.4388 |
| 1.7311        | 6.32  | 1500 | 1.7713          | 27.3113 | 9.6774  | 14.9355 | 25.6303   | 539.827  |
| 1.7432        | 8.42  | 2000 | 1.7636          | 27.8775 | 10.0766 | 15.1629 | 26.1014   | 537.5359 |
| 1.7775        | 10.53 | 2500 | 1.7588          | 27.8059 | 10.1962 | 15.3479 | 26.1829   | 540.9451 |

### Framework versions

- Transformers 4.30.2
- Pytorch 1.12.1+git7548e2f
- Datasets 2.13.2
- Tokenizers 0.13.3