metadata
license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: mt5-small-finetuned-amazon-en-es
    results: []

mt5-small-finetuned-amazon-en-es

This model is a fine-tuned version of google/mt5-small on an unknown dataset. After the final (8th) training epoch, it achieves the following results on the evaluation set:

  • Loss: 2.5259
  • Rouge1: 36.6783
  • Rouge2: 8.5304
  • Rougel: 26.4419
  • Rougelsum: 26.6455
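The card does not say how the ROUGE scores were computed, so the following is only a simplified sketch of what a ROUGE-1 F1 score measures: clipped unigram overlap between a generated summary and a reference. It assumes whitespace tokenization and a 0-100 scale (the magnitudes reported above suggest scaling by 100, but that is an assumption). The function name `rouge1_f1` is hypothetical, and real ROUGE implementations differ in tokenization and stemming, so this will not reproduce the numbers above.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1 on whitespace tokens, scaled to 0-100.

    Illustrative only: not the scorer used to produce this card's metrics.
    """
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: a unigram counts at most as often as it occurs in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f1("the cat sat", "the cat ran"))
```

ROUGE-2 applies the same idea to bigrams, while ROUGE-L and ROUGE-Lsum are based on the longest common subsequence rather than n-gram counts.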

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
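As a rough sketch, the hyperparameters above could be expressed as keyword arguments for `Seq2SeqTrainingArguments` from Transformers. Several things here are assumptions rather than facts from the card: that a seq2seq trainer was used, that training ran on a single device (so `train_batch_size` maps to `per_device_train_batch_size`), that there was no warmup, and the `output_dir` value. The helper names `linear_lr` and `build_training_args` are hypothetical. The `linear_lr` function shows what a linear schedule implies for the learning rate over the 2000 optimizer steps reported in the training results.

```python
# Hyperparameters copied from the list above, renamed to TrainingArguments fields.
HYPERPARAMS = {
    "learning_rate": 5.6e-05,
    "per_device_train_batch_size": 8,  # assumption: single device
    "per_device_eval_batch_size": 8,   # assumption: single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
}


def linear_lr(step: int, total_steps: int,
              base_lr: float = HYPERPARAMS["learning_rate"],
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def build_training_args(output_dir: str = "mt5-small-finetuned-amazon-en-es"):
    """Build training arguments; needs `transformers` (and PyTorch) installed."""
    # Imported lazily so the schedule helper above works without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMS)


if __name__ == "__main__":
    # Learning rate at the start, midpoint, and end of a 2000-step run.
    for step in (0, 1000, 2000):
        print(step, linear_lr(step, total_steps=2000))
```

Under these assumptions the learning rate starts at 5.6e-05, is halved by step 1000, and reaches zero at step 2000.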

Training results

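| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|--------|---------|-----------|
| 3.833         | 1.0   | 250  | 2.6792          | 31.8471 | 7.6517 | 22.5413 | 22.6222   |
| 3.5861        | 2.0   | 500  | 2.6408          | 36.8204 | 8.641  | 26.7687 | 26.9114   |
| 3.411         | 3.0   | 750  | 2.6037          | 36.2502 | 7.9975 | 26.3962 | 26.502    |
| 3.29          | 4.0   | 1000 | 2.5673          | 36.7784 | 8.4415 | 26.7726 | 26.9248   |
| 3.2199        | 5.0   | 1250 | 2.5568          | 36.8812 | 8.7419 | 26.7704 | 26.8682   |
| 3.1628        | 6.0   | 1500 | 2.5280          | 37.1871 | 8.8604 | 26.9372 | 27.0992   |
| 3.1292        | 7.0   | 1750 | 2.5265          | 36.6801 | 8.5876 | 26.4392 | 26.5908   |
| 3.1129        | 8.0   | 2000 | 2.5259          | 36.6783 | 8.5304 | 26.4419 | 26.6455   |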
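The table above can be read in two ways: validation loss keeps decreasing through epoch 8, while every ROUGE column peaks at epoch 6. The short sketch below copies the table values into a list and finds both optima. It also estimates an upper bound on the training set size from the 250 steps per epoch, assuming a single device and no gradient accumulation (neither is stated in the card). The variable names are illustrative.

```python
# Columns: epoch, step, training_loss, validation_loss, rouge1, rouge2, rougeL, rougeLsum
ROWS = [
    (1, 250, 3.833, 2.6792, 31.8471, 7.6517, 22.5413, 22.6222),
    (2, 500, 3.5861, 2.6408, 36.8204, 8.641, 26.7687, 26.9114),
    (3, 750, 3.411, 2.6037, 36.2502, 7.9975, 26.3962, 26.502),
    (4, 1000, 3.29, 2.5673, 36.7784, 8.4415, 26.7726, 26.9248),
    (5, 1250, 3.2199, 2.5568, 36.8812, 8.7419, 26.7704, 26.8682),
    (6, 1500, 3.1628, 2.5280, 37.1871, 8.8604, 26.9372, 27.0992),
    (7, 1750, 3.1292, 2.5265, 36.6801, 8.5876, 26.4392, 26.5908),
    (8, 2000, 3.1129, 2.5259, 36.6783, 8.5304, 26.4419, 26.6455),
]

# Epoch with the lowest validation loss.
best_by_loss = min(ROWS, key=lambda row: row[3])

# Epoch with the highest score for each ROUGE variant.
rouge_columns = {"rouge1": 4, "rouge2": 5, "rougeL": 6, "rougeLsum": 7}
best_epoch_by_rouge = {
    name: max(ROWS, key=lambda row, i=index: row[i])[0]
    for name, index in rouge_columns.items()
}

# Upper bound on training examples: steps per epoch times batch size.
# Assumes one device and no gradient accumulation; the last batch may be partial.
TRAIN_BATCH_SIZE = 8
steps_per_epoch = ROWS[0][1]
max_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("lowest validation loss at epoch", best_by_loss[0])
    print("best epoch per ROUGE metric:", best_epoch_by_rouge)
    print("training examples (upper bound):", max_train_examples)
```

The headline metrics in this card come from the final epoch. If ROUGE is the selection criterion, the epoch 6 checkpoint scores slightly higher on all four variants, though the differences are small.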

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1