metadata
license: apache-2.0
base_model: google/mt5-base
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: mt5-base_V25775
    results: []

mt5-base_V25775

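This model is a fine-tuned version of google/mt5-base. The training dataset was not recorded when this card was generated. At the last logged evaluation (step 12000, epoch 19.35) the model achieves the following results on the evaluation set: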

  • Loss: 2.1251
  • Rouge1: 28.2174
  • Rouge2: 10.5032
  • Rougel: 19.8511
  • Rougelsum: 23.3756
  • Gen Len: 72.3391

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 3
  • eval_batch_size: 3
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
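The hyperparameters above could be expressed as keyword arguments for `Seq2SeqTrainingArguments` from Transformers, roughly as in the sketch below. This is a reconstruction, not the original training script: `output_dir` and `predict_with_generate` are assumptions (the card does not state them), and mapping `train_batch_size` to `per_device_train_batch_size` assumes a single device.

```python
# Hyperparameters from this card, keyed by Seq2SeqTrainingArguments field names.
TRAINING_KWARGS = {
    "output_dir": "mt5-base_V25775",      # assumption: not stated in the card
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 3,     # card: train_batch_size (single device assumed)
    "per_device_eval_batch_size": 3,      # card: eval_batch_size
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
    "predict_with_generate": True,        # assumption: needed to report ROUGE and Gen Len
}


def build_training_args():
    """Build the arguments object; requires the transformers package."""
    from transformers import Seq2SeqTrainingArguments  # imported lazily

    return Seq2SeqTrainingArguments(**TRAINING_KWARGS)
```

Calling `build_training_args()` returns an object that can be passed to `Seq2SeqTrainer`; the dataset, tokenizer, and metric function still have to be supplied, since the card does not document them.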

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 6.8396 | 0.81 | 500 | 2.5158 | 15.7446 | 6.4867 | 12.6043 | 13.8967 | 31.1588 |
| 3.1783 | 1.61 | 1000 | 2.3673 | 21.0031 | 8.2119 | 15.9097 | 17.9958 | 46.0086 |
| 3.0094 | 2.42 | 1500 | 2.3091 | 20.5903 | 8.1394 | 15.7354 | 17.696 | 44.7339 |
| 2.8754 | 3.23 | 2000 | 2.2652 | 22.3129 | 8.6681 | 16.4687 | 18.8755 | 48.9485 |
| 2.7643 | 4.03 | 2500 | 2.2320 | 22.6675 | 8.8846 | 16.7258 | 19.0948 | 48.6781 |
| 2.7 | 4.84 | 3000 | 2.2190 | 24.1409 | 9.4362 | 17.7197 | 20.2512 | 52.8498 |
| 2.6373 | 5.65 | 3500 | 2.2100 | 24.594 | 9.4296 | 18.0182 | 20.6398 | 55.0687 |
| 2.6182 | 6.45 | 4000 | 2.2016 | 25.0763 | 9.432 | 18.1113 | 20.6752 | 57.4549 |
| 2.5552 | 7.26 | 4500 | 2.1767 | 26.6143 | 10.1357 | 19.004 | 22.0372 | 62.6738 |
| 2.5319 | 8.06 | 5000 | 2.1665 | 27.0349 | 10.3809 | 19.3472 | 22.5876 | 64.7167 |
| 2.5145 | 8.87 | 5500 | 2.1705 | 26.6323 | 9.956 | 18.9994 | 22.119 | 62.3176 |
| 2.4923 | 9.68 | 6000 | 2.1499 | 27.0052 | 10.0351 | 19.2887 | 22.4559 | 64.2747 |
| 2.4367 | 10.48 | 6500 | 2.1418 | 27.0134 | 10.1253 | 19.2614 | 22.4648 | 65.2661 |
| 2.4312 | 11.29 | 7000 | 2.1503 | 27.1655 | 9.9501 | 19.1768 | 22.3967 | 66.6953 |
| 2.4186 | 12.1 | 7500 | 2.1370 | 26.6422 | 9.7971 | 19.0065 | 22.0444 | 65.9571 |
| 2.3977 | 12.9 | 8000 | 2.1395 | 27.5204 | 10.3095 | 19.4189 | 22.7497 | 69.0901 |
| 2.3596 | 13.71 | 8500 | 2.1302 | 27.685 | 10.1479 | 19.4521 | 22.7892 | 70.0644 |
| 2.3951 | 14.52 | 9000 | 2.1298 | 27.8389 | 10.2493 | 19.6671 | 22.933 | 70.7897 |
| 2.3433 | 15.32 | 9500 | 2.1238 | 27.9095 | 10.33 | 19.6428 | 22.9721 | 70.4206 |
| 2.3789 | 16.13 | 10000 | 2.1271 | 28.0755 | 10.5819 | 19.9535 | 23.2605 | 69.97 |
| 2.3331 | 16.94 | 10500 | 2.1240 | 28.1362 | 10.4656 | 19.8198 | 23.1857 | 70.9485 |
| 2.3395 | 17.74 | 11000 | 2.1245 | 28.1459 | 10.4803 | 19.801 | 23.2469 | 71.1288 |
| 2.3238 | 18.55 | 11500 | 2.1273 | 28.2156 | 10.4437 | 19.858 | 23.3457 | 73.485 |
| 2.3181 | 19.35 | 12000 | 2.1251 | 28.2174 | 10.5032 | 19.8511 | 23.3756 | 72.3391 |
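The Epoch and Step columns make it possible to estimate how many optimizer steps one epoch took, which the card does not state. The sketch below searches for integer steps-per-epoch values that reproduce every logged epoch when rounded to two decimals. It assumes the logged epoch equals step divided by steps per epoch; the helper names are illustrative, and `linear_lr` additionally assumes zero warmup steps, since the card lists no warmup setting.

```python
# (step, epoch) pairs copied from the training results table.
LOGGED = [
    (500, 0.81), (1000, 1.61), (1500, 2.42), (2000, 3.23),
    (2500, 4.03), (3000, 4.84), (3500, 5.65), (4000, 6.45),
    (4500, 7.26), (5000, 8.06), (5500, 8.87), (6000, 9.68),
    (6500, 10.48), (7000, 11.29), (7500, 12.1), (8000, 12.9),
    (8500, 13.71), (9000, 14.52), (9500, 15.32), (10000, 16.13),
    (10500, 16.94), (11000, 17.74), (11500, 18.55), (12000, 19.35),
]


def consistent_steps_per_epoch(logged, low=1, high=5000):
    """Return all integer steps-per-epoch values matching every logged epoch."""
    return [
        n for n in range(low, high + 1)
        if all(abs(round(step / n, 2) - epoch) < 1e-9 for step, epoch in logged)
    ]


def linear_lr(step, total_steps, base_lr=2e-5):
    """Linear decay from base_lr to 0 (assumes zero warmup steps)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)


candidates = consistent_steps_per_epoch(LOGGED)
steps_per_epoch = candidates[0]
total_steps = steps_per_epoch * 20  # num_epochs from the hyperparameter list
print("steps per epoch:", candidates)
print("estimated total steps:", total_steps)
print("estimated LR at step 12000:", linear_lr(12000, total_steps))
```

The only value consistent with the table is 620 steps per epoch, which implies about 12,400 steps over 20 epochs. With a batch size of 3, and assuming a single device and no gradient accumulation, that corresponds to roughly 1,858 to 1,860 training examples. These figures are inferences from the logged numbers, not values reported by the card.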

Framework versions

  • Transformers 4.32.1
  • PyTorch 2.1.0
  • Datasets 2.12.0
  • Tokenizers 0.13.3