---
license: apache-2.0
base_model: t5-small
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: t5-small-finetuned-xsum
  results: []
---

# t5-small-finetuned-xsum

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0734
- Rouge1: 99.9038
- Rouge2: 99.838
- Rougel: 99.9145
- Rougelsum: 99.9038
- Gen Len: 93.9181

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 40
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| No log        | 1.0   | 180  | 1.7815          | 9.7268  | 2.7047  | 8.7069  | 8.7035    | 155.8472 |
| No log        | 2.0   | 360  | 0.6270          | 28.7135 | 19.99   | 27.1646 | 27.1386   | 265.2903 |
| 2.122         | 3.0   | 540  | 0.3572          | 21.4211 | 17.5143 | 21.0387 | 20.9118   | 142.7333 |
| 2.122         | 4.0   | 720  | 0.2757          | 92.8223 | 90.5077 | 92.0061 | 92.0015   | 87.0847  |
| 2.122         | 5.0   | 900  | 0.2493          | 95.6972 | 94.5082 | 95.5057 | 95.522    | 91.8556  |
| 0.4002        | 6.0   | 1080 | 0.2348          | 96.8942 | 96.2704 | 96.7552 | 96.7736   | 96.0764  |
| 0.4002        | 7.0   | 1260 | 0.2227          | 97.7669 | 97.4255 | 97.6867 | 97.6913   | 93.9097  |
| 0.4002        | 8.0   | 1440 | 0.2111          | 98.7823 | 98.5538 | 98.7622 | 98.7722   | 94.2875  |
| 0.2717        | 9.0   | 1620 | 0.1979          | 99.7455 | 99.6524 | 99.7428 | 99.7449   | 93.8569  |
| 0.2717        | 10.0  | 1800 | 0.1843          | 99.8967 | 99.8175 | 99.8953 | 99.8939   | 93.875   |
| 0.2717        | 11.0  | 1980 | 0.1716          | 99.9078 | 99.8578 | 99.9114 | 99.9095   | 93.8556  |
| 0.2244        | 12.0  | 2160 | 0.1606          | 99.9371 | 99.8807 | 99.9373 | 99.9373   | 93.9236  |
| 0.2244        | 13.0  | 2340 | 0.1512          | 99.9112 | 99.8535 | 99.9141 | 99.9103   | 93.8542  |
| 0.19          | 14.0  | 2520 | 0.1424          | 99.9573 | 99.919  | 99.9573 | 99.9573   | 93.9236  |
| 0.19          | 15.0  | 2700 | 0.1353          | 99.9679 | 99.9421 | 99.9679 | 99.9679   | 93.925   |
| 0.19          | 16.0  | 2880 | 0.1290          | 99.9234 | 99.8727 | 99.9323 | 99.9234   | 93.8736  |
| 0.1652        | 17.0  | 3060 | 0.1235          | 99.9252 | 99.8727 | 99.9359 | 99.9252   | 93.9222  |
| 0.1652        | 18.0  | 3240 | 0.1184          | 99.9038 | 99.8373 | 99.911  | 99.9021   | 93.8722  |
| 0.1652        | 19.0  | 3420 | 0.1137          | 99.9466 | 99.9074 | 99.9573 | 99.9466   | 93.9236  |
| 0.1471        | 20.0  | 3600 | 0.1092          | 99.9252 | 99.8727 | 99.9359 | 99.9252   | 93.9222  |
| 0.1471        | 21.0  | 3780 | 0.1053          | 99.9252 | 99.8727 | 99.9359 | 99.9252   | 93.9222  |
| 0.1471        | 22.0  | 3960 | 0.1014          | 99.9252 | 99.8727 | 99.9359 | 99.9252   | 93.9222  |
| 0.1331        | 23.0  | 4140 | 0.0982          | 99.9252 | 99.8727 | 99.9359 | 99.9252   | 93.9222  |
| 0.1331        | 24.0  | 4320 | 0.0949          | 99.9252 | 99.8727 | 99.9359 | 99.9252   | 93.9208  |
| 0.1226        | 25.0  | 4500 | 0.0918          | 99.9252 | 99.8727 | 99.9359 | 99.9252   | 93.9208  |
| 0.1226        | 26.0  | 4680 | 0.0892          | 99.9252 | 99.8727 | 99.9359 | 99.9252   | 93.9208  |
| 0.1226        | 27.0  | 4860 | 0.0867          | 99.9252 | 99.8727 | 99.9359 | 99.9252   | 93.9208  |
| 0.114         | 28.0  | 5040 | 0.0848          | 99.9145 | 99.8495 | 99.9252 | 99.9145   | 93.9194  |
| 0.114         | 29.0  | 5220 | 0.0828          | 99.9038 | 99.838  | 99.9145 | 99.9038   | 93.9181  |
| 0.114         | 30.0  | 5400 | 0.0811          | 99.9145 | 99.8495 | 99.9252 | 99.9145   | 93.9194  |
| 0.1074        | 31.0  | 5580 | 0.0794          | 99.9038 | 99.838  | 99.9145 | 99.9038   | 93.9181  |
| 0.1074        | 32.0  | 5760 | 0.0781          | 99.9038 | 99.838  | 99.9145 | 99.9038   | 93.9181  |
| 0.1074        | 33.0  | 5940 | 0.0769          | 99.9252 | 99.8669 | 99.9252 | 99.9252   | 93.9194  |
| 0.1027        | 34.0  | 6120 | 0.0757          | 99.9038 | 99.838  | 99.9145 | 99.9038   | 93.9181  |
| 0.1027        | 35.0  | 6300 | 0.0751          | 99.9038 | 99.838  | 99.9145 | 99.9038   | 93.9181  |
| 0.1027        | 36.0  | 6480 | 0.0745          | 99.9038 | 99.838  | 99.9145 | 99.9038   | 93.9181  |
| 0.0994        | 37.0  | 6660 | 0.0740          | 99.9038 | 99.838  | 99.9145 | 99.9038   | 93.9181  |
| 0.0994        | 38.0  | 6840 | 0.0737          | 99.9038 | 99.838  | 99.9145 | 99.9038   | 93.9181  |
| 0.0975        | 39.0  | 7020 | 0.0735          | 99.9038 | 99.838  | 99.9145 | 99.9038   | 93.9181  |
| 0.0975        | 40.0  | 7200 | 0.0734          | 99.9038 | 99.838  | 99.9145 | 99.9038   | 93.9181  |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2
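
### Reading the hyperparameters against the results table

The training hyperparameters and the Step column of the training results can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not the script that produced this model. It assumes zero warmup steps (the card lists none), a single device, and no gradient accumulation. The helper names `linear_lr` and `max_train_examples` are hypothetical.

```python
# Values copied from the hyperparameter list and the training results table.
BASE_LR = 2e-05
TRAIN_BATCH_SIZE = 16
NUM_EPOCHS = 40
STEPS_PER_EPOCH = 180  # the table reaches step 180 at epoch 1.0

# 180 steps per epoch over 40 epochs gives 7200, the Step value in the last row.
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under a linear schedule (hypothetical helper).

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    The rate ramps up linearly during warmup, then decays linearly to zero
    at total_steps.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def max_train_examples(steps_per_epoch=STEPS_PER_EPOCH, batch_size=TRAIN_BATCH_SIZE):
    """Upper bound on the training set size (hypothetical helper).

    Assumption: one device and no gradient accumulation, so each optimizer
    step consumes at most one batch. The last batch may be partial, which
    makes this a ceiling rather than an exact count.
    """
    return steps_per_epoch * batch_size


if __name__ == "__main__":
    print("total optimizer steps:", TOTAL_STEPS)
    print("learning rate at the midpoint (step 3600):", linear_lr(3600))
    print("upper bound on training examples:", max_train_examples())
```

Under these assumptions the learning rate has dropped to roughly half its initial value by epoch 20, and the training set holds at most 2880 examples. The card itself does not state the dataset size.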