license: mit
base_model: davelotito/donut-base-sroie-test
tags:
  - generated_from_trainer
datasets:
  - imagefolder
metrics:
  - bleu
  - wer
model-index:
  - name: donut-base-sroie-test
    results: []

donut-base-sroie-test

This model is a fine-tuned version of davelotito/donut-base-sroie-test on the imagefolder dataset. At the end of training (step 620, epoch 9.92) it achieves the following results on the evaluation set:

  • Loss: 0.3913
  • Bleu: 0.0706
  • Precisions: [0.8125, 0.7440860215053764, 0.7064676616915423, 0.6637168141592921]
  • Brevity Penalty: 0.0968
  • Length Ratio: 0.2998
  • Translation Length: 528
  • Reference Length: 1761
  • Cer: 0.7448
  • Wer: 0.8259
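
The Bleu score can be reproduced from the other reported numbers. The sketch below assumes the standard BLEU definition (brevity penalty times the geometric mean of the 1- to 4-gram precisions with uniform weights); the card does not state which implementation produced the metric, and the function names are illustrative.

```python
import math

# Values copied from the evaluation results listed above.
precisions = [0.8125, 0.7440860215053764, 0.7064676616915423, 0.6637168141592921]
translation_length = 528
reference_length = 1761


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty: 1.0 when the output is longer than the reference."""
    if hyp_len > ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


def bleu_from_parts(ngram_precisions, hyp_len: int, ref_len: int) -> float:
    """Brevity penalty times the geometric mean of the n-gram precisions."""
    log_mean = sum(math.log(p) for p in ngram_precisions) / len(ngram_precisions)
    return brevity_penalty(hyp_len, ref_len) * math.exp(log_mean)


length_ratio = translation_length / reference_length
bp = brevity_penalty(translation_length, reference_length)
bleu = bleu_from_parts(precisions, translation_length, reference_length)
print(f"length ratio={length_ratio:.4f}  brevity penalty={bp:.4f}  BLEU={bleu:.4f}")
```

Under this definition the brevity penalty dominates the score: the n-gram precisions lie between roughly 0.66 and 0.81, but the generated text is only about 30% as long as the references, which pulls Bleu down to about 0.07.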

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
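
How these values combine can be sketched in plain Python. The sketch assumes a single training device (consistent with 4 × 2 = 8) and no learning-rate warmup; warmup is not listed in the card, so `warmup=0` is an assumption, and `linear_lr` is an illustrative helper, not a library function. The total of 620 optimizer steps is taken from the last row of the training results.

```python
learning_rate = 2e-05
train_batch_size = 4
gradient_accumulation_steps = 2
total_steps = 620  # final optimizer step reported in the training results

# Gradients from 2 micro-batches of 4 are accumulated before each optimizer step.
effective_batch_size = train_batch_size * gradient_accumulation_steps


def linear_lr(step: int, base_lr: float = learning_rate,
              total: int = total_steps, warmup: int = 0) -> float:
    """Linear warmup followed by linear decay to zero (warmup=0 is an assumption)."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch_size)
for step in (0, 310, 620):
    print(step, linear_lr(step))
```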

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Cer | Wer |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0.99 | 62 | 0.5515 | 0.0638 | [0.7647058823529411, 0.6831896551724138, 0.6309226932668329, 0.5857988165680473] | 0.0962 | 0.2993 | 527 | 1761 | 0.7627 | 0.8549 |
| 0.5624 | 2.0 | 125 | 0.4773 | 0.0665 | [0.7763157894736842, 0.6865671641791045, 0.6403940886699507, 0.5918367346938775] | 0.0992 | 0.3021 | 532 | 1761 | 0.7562 | 0.8390 |
| 0.5624 | 2.99 | 187 | 0.4273 | 0.0658 | [0.7840909090909091, 0.6903225806451613, 0.6517412935323383, 0.6047197640117994] | 0.0968 | 0.2998 | 528 | 1761 | 0.7513 | 0.8373 |
| 0.2964 | 4.0 | 250 | 0.4007 | 0.0679 | [0.800376647834275, 0.7072649572649573, 0.6592592592592592, 0.6023391812865497] | 0.0986 | 0.3015 | 531 | 1761 | 0.7478 | 0.8286 |
| 0.2238 | 4.99 | 312 | 0.3965 | 0.0710 | [0.8142589118198874, 0.7297872340425532, 0.683046683046683, 0.6308139534883721] | 0.0999 | 0.3027 | 533 | 1761 | 0.7427 | 0.8271 |
| 0.2238 | 6.0 | 375 | 0.3939 | 0.0719 | [0.8301886792452831, 0.7537473233404711, 0.7054455445544554, 0.656891495601173] | 0.0980 | 0.3010 | 530 | 1761 | 0.7414 | 0.8246 |
| 0.147 | 6.99 | 437 | 0.3853 | 0.0693 | [0.8159392789373814, 0.7370689655172413, 0.6932668329177057, 0.6449704142011834] | 0.0962 | 0.2993 | 527 | 1761 | 0.7437 | 0.8237 |
| 0.1316 | 8.0 | 500 | 0.3827 | 0.0698 | [0.8037735849056604, 0.7301927194860813, 0.6856435643564357, 0.6392961876832844] | 0.0980 | 0.3010 | 530 | 1761 | 0.7456 | 0.8296 |
| 0.1316 | 8.99 | 562 | 0.3895 | 0.0704 | [0.8174904942965779, 0.7516198704103672, 0.715, 0.6706231454005934] | 0.0956 | 0.2987 | 526 | 1761 | 0.7439 | 0.8259 |
| 0.1153 | 9.92 | 620 | 0.3913 | 0.0706 | [0.8125, 0.7440860215053764, 0.7064676616915423, 0.6637168141592921] | 0.0968 | 0.2998 | 528 | 1761 | 0.7448 | 0.8259 |
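
The Cer and Wer columns are character and word error rates. The card does not say which implementation computed them; the following is a minimal sketch assuming the usual definition (Levenshtein edit distance divided by the reference length). The helper names and the example strings are hypothetical and are not taken from the evaluation set.

```python
def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over the reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over the reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: the prediction stops after the first two fields.
reference = "total 12.50 date 2019-01-05"
hypothesis = "total 12.50"
print(wer(reference, hypothesis), cer(reference, hypothesis))
```

In this example every error is a deletion, so a truncated prediction scores poorly even when the text it does produce is correct. That pattern would be consistent with the length ratio of about 0.30 in the table, though the card gives no per-sample breakdown to confirm it.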

Framework versions

  • Transformers 4.40.0.dev0
  • PyTorch 2.1.0
  • Datasets 2.18.0
  • Tokenizers 0.15.2