metadata
license: apache-2.0
base_model: google/mt5-base
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: cs_mT5_0.01_50_v0.1
    results: []

cs_mT5_0.01_50_v0.1

This model is a fine-tuned version of google/mt5-base; the training dataset is not documented. At the final checkpoint (epoch 50, step 300) it achieves the following results on the evaluation set:

  • Loss: 7.3188
  • Bleu: 1.2029
  • Gen Len: 19.0

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
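To make the `lr_scheduler_type: linear` setting concrete, here is a minimal sketch of how the learning rate would evolve over the run. It assumes zero warmup steps (the card lists none) and takes the total of 300 optimizer steps from the training results table (50 epochs at 6 steps each). The function name `linear_lr` and its parameters are illustrative and not part of the original training script.

```python
def linear_lr(step, base_lr=0.01, total_steps=300, warmup_steps=0):
    """Learning rate after `step` optimizer updates.

    Linear warmup from 0 to base_lr over `warmup_steps`, then linear decay
    to 0 at `total_steps`. warmup_steps=0 is an assumption; the card does
    not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the end of every tenth epoch (6 steps per epoch).
    for epoch in (0, 10, 20, 30, 40, 50):
        print(epoch, linear_lr(epoch * 6))
```

Under these assumptions the rate starts at 0.01, is halved by the end of epoch 25, and reaches zero at step 300.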

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---------------|-------|------|-----------------|------|---------|
| 3.6406 | 1.0 | 6 | 6.1758 | 0.1903 | 19.0 |
| 4.2513 | 2.0 | 12 | 6.4360 | 0.4971 | 19.0 |
| 3.1515 | 3.0 | 18 | 6.2761 | 0.1689 | 19.0 |
| 3.4713 | 4.0 | 24 | 6.4576 | 0.4973 | 19.0 |
| 3.2069 | 5.0 | 30 | 6.6858 | 0.176 | 10.0 |
| 3.5913 | 6.0 | 36 | 6.2785 | 0.7212 | 19.0 |
| 3.7814 | 7.0 | 42 | 6.1120 | 0.7212 | 19.0 |
| 3.2429 | 8.0 | 48 | 6.3660 | 0.3725 | 19.0 |
| 3.2716 | 9.0 | 54 | 6.6523 | 0.4214 | 19.0 |
| 3.3443 | 10.0 | 60 | 6.4341 | 0.3793 | 19.0 |
| 2.4705 | 11.0 | 66 | 6.8433 | 0.7412 | 19.0 |
| 3.0869 | 12.0 | 72 | 6.9583 | 0.0 | 19.0 |
| 2.5187 | 13.0 | 78 | 6.3333 | 1.1569 | 19.0 |
| 3.1211 | 14.0 | 84 | 6.4031 | 0.2813 | 19.0 |
| 2.7326 | 15.0 | 90 | 6.4055 | 0.7962 | 19.0 |
| 2.5142 | 16.0 | 96 | 6.5799 | 0.1843 | 19.0 |
| 3.0964 | 17.0 | 102 | 6.8379 | 0.9395 | 19.0 |
| 2.5998 | 18.0 | 108 | 6.4570 | 0.0 | 19.0 |
| 3.2495 | 19.0 | 114 | 6.6350 | 0.2045 | 19.0 |
| 3.2509 | 20.0 | 120 | 6.3533 | 0.7212 | 19.0 |
| 3.2998 | 21.0 | 126 | 6.3142 | 0.6756 | 19.0 |
| 2.7829 | 22.0 | 132 | 6.5953 | 0.6646 | 19.0 |
| 3.0842 | 23.0 | 138 | 6.6276 | 0.7056 | 19.0 |
| 1.8502 | 24.0 | 144 | 6.6472 | 0.2386 | 19.0 |
| 1.945 | 25.0 | 150 | 6.6534 | 0.6966 | 19.0 |
| 2.7704 | 26.0 | 156 | 7.1955 | 0.7611 | 13.0 |
| 3.1289 | 27.0 | 162 | 6.6522 | 0.7286 | 17.0 |
| 3.0663 | 28.0 | 168 | 6.3873 | 0.8029 | 19.0 |
| 3.4269 | 29.0 | 174 | 6.4310 | 0.204 | 19.0 |
| 2.7845 | 30.0 | 180 | 6.7221 | 0.3228 | 19.0 |
| 2.0443 | 31.0 | 186 | 6.8353 | 0.3228 | 19.0 |
| 3.1621 | 32.0 | 192 | 7.1400 | 0.1346 | 19.0 |
| 2.4147 | 33.0 | 198 | 6.8844 | 1.2029 | 19.0 |
| 2.5869 | 34.0 | 204 | 6.7074 | 0.7475 | 19.0 |
| 2.1119 | 35.0 | 210 | 6.5778 | 0.7212 | 19.0 |
| 1.7629 | 36.0 | 216 | 6.5553 | 0.7867 | 19.0 |
| 2.3745 | 37.0 | 222 | 6.7126 | 0.7663 | 19.0 |
| 2.368 | 38.0 | 228 | 6.8008 | 0.4815 | 19.0 |
| 2.17 | 39.0 | 234 | 6.6388 | 0.7892 | 19.0 |
| 2.4311 | 40.0 | 240 | 6.6423 | 0.3228 | 19.0 |
| 2.8392 | 41.0 | 246 | 6.7127 | 0.3226 | 19.0 |
| 2.386 | 42.0 | 252 | 6.8011 | 0.31 | 19.0 |
| 2.7473 | 43.0 | 258 | 6.8704 | 0.31 | 19.0 |
| 1.9796 | 44.0 | 264 | 6.9846 | 1.2029 | 19.0 |
| 1.4857 | 45.0 | 270 | 7.1239 | 1.2029 | 19.0 |
| 1.8413 | 46.0 | 276 | 7.2177 | 1.194 | 19.0 |
| 2.171 | 47.0 | 282 | 7.2605 | 1.2029 | 19.0 |
| 1.9659 | 48.0 | 288 | 7.3048 | 1.2029 | 19.0 |
| 1.3681 | 49.0 | 294 | 7.3093 | 1.2029 | 19.0 |
| 2.086 | 50.0 | 300 | 7.3188 | 1.2029 | 19.0 |
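The reported headline numbers are those of the last epoch, which is not necessarily the best checkpoint. As a rough sketch of how one might check this, the snippet below scans the (epoch, validation loss, BLEU) columns copied from the results above and picks out the epoch with the lowest validation loss and the first epoch with the highest BLEU. The variable names are illustrative, and the script is not part of the original training code.

```python
# Columns: epoch, validation loss, BLEU (copied from the training results).
LOG = """\
1 6.1758 0.1903
2 6.4360 0.4971
3 6.2761 0.1689
4 6.4576 0.4973
5 6.6858 0.176
6 6.2785 0.7212
7 6.1120 0.7212
8 6.3660 0.3725
9 6.6523 0.4214
10 6.4341 0.3793
11 6.8433 0.7412
12 6.9583 0.0
13 6.3333 1.1569
14 6.4031 0.2813
15 6.4055 0.7962
16 6.5799 0.1843
17 6.8379 0.9395
18 6.4570 0.0
19 6.6350 0.2045
20 6.3533 0.7212
21 6.3142 0.6756
22 6.5953 0.6646
23 6.6276 0.7056
24 6.6472 0.2386
25 6.6534 0.6966
26 7.1955 0.7611
27 6.6522 0.7286
28 6.3873 0.8029
29 6.4310 0.204
30 6.7221 0.3228
31 6.8353 0.3228
32 7.1400 0.1346
33 6.8844 1.2029
34 6.7074 0.7475
35 6.5778 0.7212
36 6.5553 0.7867
37 6.7126 0.7663
38 6.8008 0.4815
39 6.6388 0.7892
40 6.6423 0.3228
41 6.7127 0.3226
42 6.8011 0.31
43 6.8704 0.31
44 6.9846 1.2029
45 7.1239 1.2029
46 7.2177 1.194
47 7.2605 1.2029
48 7.3048 1.2029
49 7.3093 1.2029
50 7.3188 1.2029
"""

# Parse each line into an (epoch, validation_loss, bleu) tuple.
rows = [
    (int(epoch), float(val_loss), float(bleu))
    for epoch, val_loss, bleu in (line.split() for line in LOG.splitlines())
]

best_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
best_bleu = max(rows, key=lambda r: r[2])  # first epoch reaching the top BLEU
final = rows[-1]

print("lowest validation loss:", best_loss)
print("first highest BLEU:", best_bleu)
print("final epoch:", final)
```

In this log the validation loss is lowest at epoch 7 (6.1120) and ends at its highest value (7.3188), while the training loss keeps falling. The top BLEU of 1.2029 is first reached at epoch 33. This pattern may point to overfitting or to instability at the chosen learning rate, although the card gives no further detail to confirm either.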

Framework versions

  • Transformers 4.35.2
  • PyTorch 1.13.1+cu117
  • Datasets 2.16.1
  • Tokenizers 0.15.0