legal-flan-t5-base / README.md
metadata
license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: output
    results: []

output

This model is a fine-tuned version of google/flan-t5-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1885
  • Rouge1: 65.4762
  • Rouge2: 0.0
  • Rougel: 65.4762
  • Rougelsum: 65.4762
  • Gen Len: 2.1905
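
The pattern in these numbers (Rouge1 equal to Rougel, Rouge2 at 0.0, and a generation length near two tokens) is what one would expect if the model emits very short, possibly single-word, outputs. The card does not confirm this. The sketch below uses a simplified, hypothetical `rouge_n` helper (not the ROUGE implementation used to produce the numbers above) and made-up one-word labels to show why bigram overlap is necessarily zero for one-word strings.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset of n-grams from a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(prediction, reference, n):
    """Simplified ROUGE-N F1 on whitespace tokens (illustrative only)."""
    pred = ngrams(prediction.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    if not pred or not ref:
        return 0.0  # no n-grams of this order exist, so no overlap is possible
    overlap = sum((pred & ref).values())
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical one-word labels; the card does not describe the real data.
pairs = [("yes", "yes"), ("no", "yes"), ("yes", "yes")]
rouge1 = 100 * sum(rouge_n(p, r, 1) for p, r in pairs) / len(pairs)
rouge2 = 100 * sum(rouge_n(p, r, 2) for p, r in pairs) / len(pairs)
print(round(rouge1, 4), rouge2)
```

Under this reading Rouge1 behaves like exact-match accuracy. Every Rouge1 value reported in this card is a multiple of 1/84 (for example, 65.4762 ≈ 55/84), which is consistent with an evaluation set of 84 examples, but that is an inference rather than a documented fact.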

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 20
  • num_epochs: 50
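
To make the scheduler settings concrete, here is a minimal sketch of the learning rate over training, assuming `lr_scheduler_type: linear` means the usual shape of linear warmup followed by linear decay to zero. The function name `linear_lr` is hypothetical, and `total_steps=2100` is taken from the last row of the training results (50 epochs of 42 steps each).

```python
def linear_lr(step, base_lr=3e-05, warmup_steps=20, total_steps=2100):
    """Learning rate at an optimizer step: linear warmup, then linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 10, 20, 1060, 2100):
    print(step, f"{linear_lr(step):.2e}")
```

With only 20 warmup steps, the peak rate of 3e-05 is reached before the end of the first epoch, and the rate then falls steadily for the rest of the run.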

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1.2679 | 1.0 | 42 | 1.3033 | 48.8095 | 0.0 | 48.8095 | 48.8095 | 4.0119 |
| 1.0917 | 2.0 | 84 | 1.1075 | 48.8095 | 0.0 | 48.8095 | 48.8095 | 2.2738 |
| 0.8305 | 3.0 | 126 | 1.0366 | 45.2381 | 0.0 | 45.2381 | 45.2381 | 2.3095 |
| 0.6058 | 4.0 | 168 | 0.9865 | 48.8095 | 0.0 | 48.8095 | 48.8095 | 2.4524 |
| 0.5114 | 5.0 | 210 | 0.9289 | 55.9524 | 0.0 | 55.9524 | 55.9524 | 2.4048 |
| 0.6026 | 6.0 | 252 | 0.9373 | 53.5714 | 0.0 | 53.5714 | 53.5714 | 2.3214 |
| 0.6428 | 7.0 | 294 | 0.8762 | 53.5714 | 0.0 | 53.5714 | 53.5714 | 2.3095 |
| 0.5375 | 8.0 | 336 | 0.8908 | 54.7619 | 0.0 | 54.7619 | 54.7619 | 2.3333 |
| 0.4296 | 9.0 | 378 | 0.9172 | 50.0 | 0.0 | 50.0 | 50.0 | 2.3452 |
| 0.4644 | 10.0 | 420 | 0.8882 | 60.7143 | 0.0 | 60.7143 | 60.7143 | 2.3452 |
| 0.42 | 11.0 | 462 | 0.8917 | 54.7619 | 0.0 | 54.7619 | 54.7619 | 2.2619 |
| 0.3727 | 12.0 | 504 | 0.8710 | 55.9524 | 0.0 | 55.9524 | 55.9524 | 2.3571 |
| 0.4061 | 13.0 | 546 | 0.8817 | 54.7619 | 0.0 | 54.7619 | 54.7619 | 2.2857 |
| 0.3221 | 14.0 | 588 | 0.9284 | 57.1429 | 0.0 | 57.1429 | 57.1429 | 2.2857 |
| 0.3676 | 15.0 | 630 | 0.9313 | 57.1429 | 0.0 | 57.1429 | 57.1429 | 2.0476 |
| 0.264 | 16.0 | 672 | 0.9315 | 59.5238 | 0.0 | 59.5238 | 59.5238 | 2.0595 |
| 0.2933 | 17.0 | 714 | 0.9265 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.1310 |
| 0.2446 | 18.0 | 756 | 0.9254 | 61.9048 | 0.0 | 61.9048 | 61.9048 | 2.0714 |
| 0.2356 | 19.0 | 798 | 0.9390 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.0714 |
| 0.3102 | 20.0 | 840 | 0.9837 | 61.9048 | 0.0 | 61.9048 | 61.9048 | 2.1071 |
| 0.1539 | 21.0 | 882 | 0.9727 | 60.7143 | 0.0 | 60.7143 | 60.7143 | 2.0952 |
| 0.1674 | 22.0 | 924 | 1.0114 | 61.9048 | 0.0 | 61.9048 | 61.9048 | 2.0952 |
| 0.1831 | 23.0 | 966 | 0.9869 | 61.9048 | 0.0 | 61.9048 | 61.9048 | 2.0595 |
| 0.201 | 24.0 | 1008 | 0.9904 | 60.7143 | 0.0 | 60.7143 | 60.7143 | 2.0595 |
| 0.1602 | 25.0 | 1050 | 0.9883 | 60.7143 | 0.0 | 60.7143 | 60.7143 | 2.0595 |
| 0.158 | 26.0 | 1092 | 1.0057 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.1071 |
| 0.1468 | 27.0 | 1134 | 0.9998 | 67.8571 | 0.0 | 67.8571 | 67.8571 | 2.1429 |
| 0.109 | 28.0 | 1176 | 1.0052 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.3333 |
| 0.1397 | 29.0 | 1218 | 1.0137 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.3333 |
| 0.1204 | 30.0 | 1260 | 1.0482 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.3452 |
| 0.1577 | 31.0 | 1302 | 1.0787 | 66.6667 | 0.0 | 66.6667 | 66.6667 | 2.3452 |
| 0.1112 | 32.0 | 1344 | 1.0513 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.3452 |
| 0.0932 | 33.0 | 1386 | 1.0786 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.3452 |
| 0.0989 | 34.0 | 1428 | 1.1378 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.3452 |
| 0.0858 | 35.0 | 1470 | 1.1055 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.3452 |
| 0.1056 | 36.0 | 1512 | 1.1297 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.3571 |
| 0.14 | 37.0 | 1554 | 1.1604 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.3452 |
| 0.0592 | 38.0 | 1596 | 1.1213 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.3452 |
| 0.1121 | 39.0 | 1638 | 1.1489 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.3452 |
| 0.1917 | 40.0 | 1680 | 1.1544 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.3452 |
| 0.1178 | 41.0 | 1722 | 1.1561 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.3452 |
| 0.0761 | 42.0 | 1764 | 1.2013 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.1905 |
| 0.0911 | 43.0 | 1806 | 1.2075 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.1548 |
| 0.1081 | 44.0 | 1848 | 1.2134 | 66.6667 | 0.0 | 66.6667 | 66.6667 | 2.1548 |
| 0.089 | 45.0 | 1890 | 1.1861 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.1905 |
| 0.0828 | 46.0 | 1932 | 1.1988 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.1905 |
| 0.0818 | 47.0 | 1974 | 1.1886 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.1905 |
| 0.0899 | 48.0 | 2016 | 1.1988 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.1905 |
| 0.0923 | 49.0 | 2058 | 1.1968 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.1905 |
| 0.0859 | 50.0 | 2100 | 1.1885 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.1905 |
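
In the results above, validation loss is lowest at epoch 12 (0.8710) and rises afterwards while training loss keeps falling, whereas Rouge1 peaks later, at epoch 27 (67.8571). The sketch below shows how one might pick a checkpoint under either criterion. It uses a hand-copied subset of rows that includes both extremes. The card does not say which checkpoint was saved, though the headline metrics match the epoch-50 row.

```python
# (epoch, validation_loss, rouge1): a subset of rows copied from the results table.
rows = [
    (1, 1.3033, 48.8095),
    (7, 0.8762, 53.5714),
    (12, 0.8710, 55.9524),
    (17, 0.9265, 64.2857),
    (27, 0.9998, 67.8571),
    (31, 1.0787, 66.6667),
    (44, 1.2134, 66.6667),
    (50, 1.1885, 65.4762),
]

# Lower validation loss is better; higher Rouge1 is better.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge = max(rows, key=lambda r: r[2])

print("lowest validation loss at epoch", best_by_loss[0])
print("highest Rouge1 at epoch", best_by_rouge[0])
```

The two criteria disagree here, so the choice depends on which metric matters for the downstream task. In Transformers, `load_best_model_at_end` together with `metric_for_best_model` can automate this selection; whether those options were used for this run is not stated.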

Framework versions

  • Transformers 4.26.1
  • PyTorch 1.13.1+cu117
  • Datasets 2.10.1
  • Tokenizers 0.13.2