mt5-small-finetuned-arith

This model is a fine-tuned version of google/mt5-small. The training dataset is not recorded in this card. After the final epoch (epoch 64, step 448), it achieves the following results on the evaluation set:

  • Loss: 0.6651
  • Rouge1: 90.0
  • Rouge2: 70.4082
  • Rougel: 85.3061
  • Rougelsum: 85.102
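For readers unfamiliar with the metric, Rouge1 measures unigram overlap between a generated output and its reference. The following is a minimal sketch of an F1-style unigram overlap score, assuming plain whitespace tokenization and a 0-100 scale like the values reported above. The function name `rouge1_f1` is hypothetical, and the evaluation script behind this card may tokenize or normalize text differently, so this sketch is not expected to reproduce the reported numbers exactly.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between two whitespace-tokenized strings, scaled to 0-100.

    Illustrative only: whitespace tokenization is an assumption, not
    necessarily what the card's evaluation script used.
    """
    pred_counts = Counter(prediction.split())
    ref_counts = Counter(reference.split())
    # Clipped overlap: a token counts at most as often as it occurs in both strings.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Half of the predicted tokens match the reference, and vice versa.
    print(rouge1_f1("1 2 3 4", "1 2 5 6"))
```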

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 64
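As a rough illustration of what these settings imply, the sketch below computes the learning rate under a linear schedule and bounds the training set size from the step counts in the training results (7 optimizer steps per epoch, 448 in total). It assumes no warmup steps, a single device, no gradient accumulation, and that the last partial batch is kept; none of these is stated in the card. The helper `linear_lr` is a hypothetical name written for this example, not a library function.

```python
LEARNING_RATE = 5.6e-05  # from the card
NUM_EPOCHS = 64          # from the card
TRAIN_BATCH_SIZE = 128   # from the card
STEPS_PER_EPOCH = 7      # from the training results: step 7 at epoch 1.0
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 448, matching the final logged step
WARMUP_STEPS = 0         # assumption: the card lists no warmup


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Under the stated assumptions, 7 steps per epoch at batch size 128 means
# the training set holds between 769 and 896 examples.
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    for step in (0, 224, 448):
        print(step, linear_lr(step))
    print(min_examples, max_examples)
```

Under these assumptions the learning rate is halved by the midpoint of training (step 224) and reaches zero at step 448.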

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| No log | 1.0 | 7 | 11.7623 | 0.0 | 0.0 | 0.0 | 0.0 |
| No log | 2.0 | 14 | 11.0473 | 0.2041 | 0.0 | 0.2041 | 0.2041 |
| No log | 3.0 | 21 | 9.4965 | 0.4082 | 0.0 | 0.4082 | 0.4082 |
| No log | 4.0 | 28 | 8.3848 | 0.8673 | 0.0 | 0.8673 | 0.8673 |
| No log | 5.0 | 35 | 7.6170 | 1.7515 | 0.0 | 1.7114 | 1.6753 |
| No log | 6.0 | 42 | 7.0008 | 4.9101 | 0.0 | 4.9093 | 4.8585 |
| No log | 7.0 | 49 | 6.7836 | 8.0777 | 0.0 | 7.7956 | 7.9186 |
| 16.7453 | 8.0 | 56 | 6.6780 | 12.3572 | 0.0 | 12.1332 | 11.878 |
| 16.7453 | 9.0 | 63 | 5.2800 | 13.5863 | 0.1701 | 12.7907 | 12.8991 |
| 16.7453 | 10.0 | 70 | 4.4990 | 13.8751 | 0.1701 | 13.1962 | 13.1834 |
| 16.7453 | 11.0 | 77 | 4.3624 | 13.4276 | 0.1701 | 13.3009 | 13.2722 |
| 16.7453 | 12.0 | 84 | 4.1101 | 14.0537 | 0.3401 | 13.3534 | 13.354 |
| 16.7453 | 13.0 | 91 | 3.7171 | 14.2128 | 0.3401 | 13.4985 | 13.4888 |
| 16.7453 | 14.0 | 98 | 3.4322 | 13.9164 | 0.1701 | 13.3916 | 13.3625 |
| 16.7453 | 15.0 | 105 | 3.2408 | 13.931 | 0.3401 | 13.7998 | 13.7901 |
| 6.4188 | 16.0 | 112 | 3.0734 | 14.0816 | 0.3401 | 13.7901 | 13.7901 |
| 6.4188 | 17.0 | 119 | 2.9270 | 14.344 | 0.8242 | 14.1983 | 14.208 |
| 6.4188 | 18.0 | 126 | 2.7746 | 16.7178 | 2.4928 | 16.3946 | 16.4334 |
| 6.4188 | 19.0 | 133 | 2.6117 | 22.7164 | 7.4678 | 22.1643 | 22.1381 |
| 6.4188 | 20.0 | 140 | 2.4419 | 25.0641 | 9.4306 | 24.2861 | 24.2714 |
| 6.4188 | 21.0 | 147 | 2.2793 | 32.0373 | 13.6803 | 31.0317 | 30.8515 |
| 6.4188 | 22.0 | 154 | 2.0741 | 40.1666 | 21.0894 | 38.5458 | 38.4592 |
| 6.4188 | 23.0 | 161 | 1.8635 | 40.1133 | 21.1222 | 38.1971 | 38.1165 |
| 3.1581 | 24.0 | 168 | 1.6788 | 47.1732 | 25.3843 | 44.6854 | 44.6021 |
| 3.1581 | 25.0 | 175 | 1.5153 | 49.4894 | 27.0538 | 46.9745 | 46.8775 |
| 3.1581 | 26.0 | 182 | 1.3337 | 47.7463 | 25.9589 | 45.3779 | 45.2896 |
| 3.1581 | 27.0 | 189 | 1.1634 | 48.6608 | 26.067 | 46.293 | 46.1794 |
| 3.1581 | 28.0 | 196 | 1.0392 | 86.6181 | 65.5782 | 81.9242 | 81.8732 |
| 3.1581 | 29.0 | 203 | 0.9519 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 3.1581 | 30.0 | 210 | 0.8837 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 3.1581 | 31.0 | 217 | 0.8246 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 2.0354 | 32.0 | 224 | 0.7630 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 2.0354 | 33.0 | 231 | 0.7221 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 2.0354 | 34.0 | 238 | 0.6957 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 2.0354 | 35.0 | 245 | 0.6852 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 2.0354 | 36.0 | 252 | 0.6734 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 2.0354 | 37.0 | 259 | 0.6667 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 2.0354 | 38.0 | 266 | 0.6670 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 2.0354 | 39.0 | 273 | 0.6684 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.5363 | 40.0 | 280 | 0.6626 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.5363 | 41.0 | 287 | 0.6621 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.5363 | 42.0 | 294 | 0.6699 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.5363 | 43.0 | 301 | 0.6751 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.5363 | 44.0 | 308 | 0.6839 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.5363 | 45.0 | 315 | 0.6987 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.5363 | 46.0 | 322 | 0.7060 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.5363 | 47.0 | 329 | 0.7125 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.324 | 48.0 | 336 | 0.7103 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.324 | 49.0 | 343 | 0.7098 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.324 | 50.0 | 350 | 0.7088 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.324 | 51.0 | 357 | 0.7112 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.324 | 52.0 | 364 | 0.7094 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.324 | 53.0 | 371 | 0.7041 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.324 | 54.0 | 378 | 0.6939 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.2374 | 55.0 | 385 | 0.6843 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.2374 | 56.0 | 392 | 0.6791 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.2374 | 57.0 | 399 | 0.6755 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.2374 | 58.0 | 406 | 0.6715 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.2374 | 59.0 | 413 | 0.6661 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.2374 | 60.0 | 420 | 0.6639 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.2374 | 61.0 | 427 | 0.6629 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.2374 | 62.0 | 434 | 0.6635 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.199 | 63.0 | 441 | 0.6646 | 90.0 | 70.4082 | 85.3061 | 85.102 |
| 1.199 | 64.0 | 448 | 0.6651 | 90.0 | 70.4082 | 85.3061 | 85.102 |
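The headline results in this card correspond to the final epoch, but the validation loss does not decrease monotonically, and the ROUGE columns stop changing from epoch 29 onward. The sketch below is one way to locate the epoch with the lowest validation loss; the list is transcribed from the Validation Loss column of the training results, and the variable names are hypothetical.

```python
# Validation loss per epoch (epochs 1..64), transcribed from the training results.
VAL_LOSS = [
    11.7623, 11.0473, 9.4965, 8.3848, 7.6170, 7.0008, 6.7836, 6.6780,
    5.2800, 4.4990, 4.3624, 4.1101, 3.7171, 3.4322, 3.2408, 3.0734,
    2.9270, 2.7746, 2.6117, 2.4419, 2.2793, 2.0741, 1.8635, 1.6788,
    1.5153, 1.3337, 1.1634, 1.0392, 0.9519, 0.8837, 0.8246, 0.7630,
    0.7221, 0.6957, 0.6852, 0.6734, 0.6667, 0.6670, 0.6684, 0.6626,
    0.6621, 0.6699, 0.6751, 0.6839, 0.6987, 0.7060, 0.7125, 0.7103,
    0.7098, 0.7088, 0.7112, 0.7094, 0.7041, 0.6939, 0.6843, 0.6791,
    0.6755, 0.6715, 0.6661, 0.6639, 0.6629, 0.6635, 0.6646, 0.6651,
]

# Index of the smallest validation loss; epochs are 1-based.
best_index = min(range(len(VAL_LOSS)), key=VAL_LOSS.__getitem__)
best_epoch = best_index + 1
best_loss = VAL_LOSS[best_index]
final_loss = VAL_LOSS[-1]

if __name__ == "__main__":
    print(f"lowest validation loss {best_loss} at epoch {best_epoch}")
    print(f"final validation loss {final_loss} at epoch {len(VAL_LOSS)}")
```

On these numbers the lowest validation loss is 0.6621 at epoch 41, slightly below the final value of 0.6651, while the ROUGE scores are identical at both epochs.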

Framework versions

  • Transformers 4.33.1
  • Pytorch 1.12.1
  • Datasets 2.14.5
  • Tokenizers 0.13.3

Model tree for milinbhade1214/mt5-small-finetuned-arith

  • Base model: google/mt5-small
  • Finetuned: this model