metadata
license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: spell_corrector_small_v7
    results: []

spell_corrector_small_v7

This model is a fine-tuned version of google/mt5-small. The training dataset was not recorded (the auto-generated card lists it as "None"). After the final epoch (20), it achieves the following results on the evaluation set:

  • Loss: 0.5549
  • Bleu: 34.7876
  • Gen Len: 15.7815
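Since the card gives no usage instructions, the following is a minimal sketch of how a fine-tuned mT5 checkpoint like this one is typically loaded for inference with Transformers. The repository id `Buseak/spell_corrector_small_v7`, the helper name `correct_spelling`, and the `max_new_tokens` value are assumptions for illustration; the card does not state the expected input language or whether a task prefix is required.

```python
# Assumed repository id, inferred from the model name; not confirmed by the card.
MODEL_ID = "Buseak/spell_corrector_small_v7"


def correct_spelling(text, model_id=MODEL_ID, max_new_tokens=32):
    """Run one input string through the seq2seq model and return the decoded output.

    Hypothetical helper: the card does not document input format or any prefix.
    """
    # Imported lazily so the heavy dependency loads only when the helper is called.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `correct_spelling("some misspelled text")` downloads the weights on first use. For repeated calls, loading the tokenizer and model once outside the function would be the better design.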

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
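To make the schedule concrete, here is a small sketch of the learning rate implied by these settings. It assumes zero warmup steps (the card lists none), and takes 976 optimizer steps per epoch from the training results table, which gives 19520 steps in total. The size estimate additionally assumes a single device, no gradient accumulation, and that the last partial batch is kept. The function names are illustrative only.

```python
BASE_LR = 2e-5
TRAIN_BATCH_SIZE = 8
NUM_EPOCHS = 20
STEPS_PER_EPOCH = 976  # taken from the training results table
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 19520


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at a given optimizer step under linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def train_set_size_bounds(steps_per_epoch=STEPS_PER_EPOCH, batch_size=TRAIN_BATCH_SIZE):
    """Range of training-set sizes n with ceil(n / batch_size) == steps_per_epoch."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


if __name__ == "__main__":
    for epoch in (0, 5, 10, 15, 20):
        print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
    print(train_set_size_bounds())
```

Under these assumptions the rate falls from 2e-05 at step 0 to half that at the end of epoch 10 and reaches zero at step 19520, and the training set holds between 7801 and 7808 examples.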

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|
| 14.2184       | 1.0   | 976   | 1.6501          | 16.2132 | 13.5571 |
| 2.8018        | 2.0   | 1952  | 1.2055          | 23.1195 | 15.9748 |
| 2.0238        | 3.0   | 2928  | 0.9646          | 26.7454 | 15.9865 |
| 1.6928        | 4.0   | 3904  | 0.8372          | 28.6482 | 15.9601 |
| 1.4888        | 5.0   | 4880  | 0.7906          | 29.6306 | 15.9221 |
| 1.3855        | 6.0   | 5856  | 0.7393          | 30.3841 | 15.9006 |
| 1.2999        | 7.0   | 6832  | 0.7029          | 31.2225 | 15.8612 |
| 1.2379        | 8.0   | 7808  | 0.6794          | 31.6015 | 15.8666 |
| 1.1709        | 9.0   | 8784  | 0.6572          | 32.2153 | 15.8512 |
| 1.1433        | 10.0  | 9760  | 0.6303          | 32.7529 | 15.8288 |
| 1.1248        | 11.0  | 10736 | 0.6184          | 33.144  | 15.8244 |
| 1.0703        | 12.0  | 11712 | 0.6072          | 33.4743 | 15.8121 |
| 1.0547        | 13.0  | 12688 | 0.5937          | 33.7492 | 15.8139 |
| 1.0275        | 14.0  | 13664 | 0.5779          | 34.1454 | 15.7952 |
| 1.0122        | 15.0  | 14640 | 0.5727          | 34.2908 | 15.7907 |
| 1.0071        | 16.0  | 15616 | 0.5662          | 34.4457 | 15.7874 |
| 1.0017        | 17.0  | 16592 | 0.5609          | 34.6225 | 15.7847 |
| 0.9879        | 18.0  | 17568 | 0.5575          | 34.6937 | 15.7832 |
| 0.9814        | 19.0  | 18544 | 0.5554          | 34.7827 | 15.7816 |
| 0.9793        | 20.0  | 19520 | 0.5549          | 34.7876 | 15.7815 |
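As a quick way to read the table, the sketch below computes the epoch-over-epoch change in BLEU from the logged values. The variable names are illustrative, and the numbers are copied from the table rather than recomputed.

```python
# BLEU on the evaluation set after epochs 1..20, copied from the results table.
BLEU_BY_EPOCH = [
    16.2132, 23.1195, 26.7454, 28.6482, 29.6306,
    30.3841, 31.2225, 31.6015, 32.2153, 32.7529,
    33.144, 33.4743, 33.7492, 34.1454, 34.2908,
    34.4457, 34.6225, 34.6937, 34.7827, 34.7876,
]

# Gain in BLEU from each epoch to the next (19 values for 20 epochs).
bleu_gains = [
    round(later - earlier, 4)
    for earlier, later in zip(BLEU_BY_EPOCH, BLEU_BY_EPOCH[1:])
]

if __name__ == "__main__":
    for epoch, gain in enumerate(bleu_gains, start=2):
        print(f"epoch {epoch:2d}: {gain:+.4f}")
```

BLEU rises at every epoch, but the gain shrinks from about 6.9 points between epochs 1 and 2 to about 0.005 between epochs 19 and 20, which suggests the run had largely converged under this schedule.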

Framework versions

  • Transformers 4.31.0
  • PyTorch 2.0.1+cu118
  • Datasets 2.14.2
  • Tokenizers 0.13.3