# long-t5-local-base-finetuned-justification-v06
This model is a fine-tuned version of [google/long-t5-local-base](https://huggingface.co/google/long-t5-local-base) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.8645
## Model description
More information needed
## Intended uses & limitations
More information needed
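
The card does not document intended uses, but the checkpoint can be loaded like any LongT5 seq2seq model. Below is a minimal inference sketch, assuming the checkpoint is hosted on the Hugging Face Hub under the repo id in the title; the input text and generation settings are illustrative placeholders, not documented behavior:

```python
# Minimal inference sketch. Assumptions: repo id as in the card title;
# the input text, max lengths, and beam settings are illustrative.
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

repo_id = "satyanshu404/long-t5-local-base-finetuned-justification-v06"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = LongT5ForConditionalGeneration.from_pretrained(repo_id)

text = "Replace with a long input document."  # placeholder input
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)
output_ids = model.generate(**inputs, max_new_tokens=256, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```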
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 200
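
For reference, the hyperparameters above map onto transformers' `Seq2SeqTrainingArguments` roughly as follows. This is a hedged sketch, not the author's script: the `output_dir` is illustrative, and dataset loading and preprocessing are omitted because the training data is not documented.

```python
# Hedged reproduction sketch of the reported hyperparameters; output_dir is
# illustrative and the dataset pipeline is omitted (training data undocumented).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-base-finetuned-justification-v06",
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    evaluation_strategy="epoch",  # matches the per-epoch validation losses below
)
```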
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
5.625 | 1.0 | 676 | 2.2487 |
1.8303 | 2.0 | 1352 | 1.9383 |
1.4604 | 3.0 | 2028 | 1.7984 |
1.4095 | 4.0 | 2704 | 1.6848 |
1.344 | 5.0 | 3380 | 1.6151 |
1.2566 | 6.0 | 4056 | 1.5658 |
1.2083 | 7.0 | 4732 | 1.5206 |
1.1799 | 8.0 | 5408 | 1.4763 |
1.1419 | 9.0 | 6084 | 1.4418 |
1.0928 | 10.0 | 6760 | 1.4131 |
1.134 | 11.0 | 7436 | 1.3941 |
1.0297 | 12.0 | 8112 | 1.3594 |
1.0153 | 13.0 | 8788 | 1.3456 |
1.0246 | 14.0 | 9464 | 1.3439 |
0.9392 | 15.0 | 10140 | 1.3199 |
0.9589 | 16.0 | 10816 | 1.3139 |
0.9286 | 17.0 | 11492 | 1.3046 |
0.8812 | 18.0 | 12168 | 1.2886 |
0.9437 | 19.0 | 12844 | 1.2862 |
0.8756 | 20.0 | 13520 | 1.2817 |
0.8738 | 21.0 | 14196 | 1.2762 |
0.8467 | 22.0 | 14872 | 1.2668 |
0.8306 | 23.0 | 15548 | 1.2623 |
0.8471 | 24.0 | 16224 | 1.2637 |
0.8003 | 25.0 | 16900 | 1.2530 |
0.8201 | 26.0 | 17576 | 1.2478 |
0.7935 | 27.0 | 18252 | 1.2573 |
0.7493 | 28.0 | 18928 | 1.2488 |
0.772 | 29.0 | 19604 | 1.2480 |
0.7537 | 30.0 | 20280 | 1.2558 |
0.7466 | 31.0 | 20956 | 1.2511 |
0.7481 | 32.0 | 21632 | 1.2561 |
0.7016 | 33.0 | 22308 | 1.2619 |
0.7067 | 34.0 | 22984 | 1.2557 |
0.7206 | 35.0 | 23660 | 1.2493 |
0.6842 | 36.0 | 24336 | 1.2528 |
0.6835 | 37.0 | 25012 | 1.2626 |
0.6799 | 38.0 | 25688 | 1.2605 |
0.6293 | 39.0 | 26364 | 1.2746 |
0.6269 | 40.0 | 27040 | 1.2725 |
0.6341 | 41.0 | 27716 | 1.2671 |
0.6193 | 42.0 | 28392 | 1.2739 |
0.6434 | 43.0 | 29068 | 1.2784 |
0.6 | 44.0 | 29744 | 1.2846 |
0.5844 | 45.0 | 30420 | 1.3010 |
0.5801 | 46.0 | 31096 | 1.2964 |
0.5803 | 47.0 | 31772 | 1.2938 |
0.5755 | 48.0 | 32448 | 1.2986 |
0.5703 | 49.0 | 33124 | 1.3067 |
0.566 | 50.0 | 33800 | 1.2990 |
0.5356 | 51.0 | 34476 | 1.3021 |
0.5331 | 52.0 | 35152 | 1.2996 |
0.5657 | 53.0 | 35828 | 1.3225 |
0.5195 | 54.0 | 36504 | 1.3237 |
0.5199 | 55.0 | 37180 | 1.3253 |
0.5225 | 56.0 | 37856 | 1.3275 |
0.4995 | 57.0 | 38532 | 1.3347 |
0.4991 | 58.0 | 39208 | 1.3356 |
0.4848 | 59.0 | 39884 | 1.3534 |
0.4731 | 60.0 | 40560 | 1.3557 |
0.4526 | 61.0 | 41236 | 1.3445 |
0.47 | 62.0 | 41912 | 1.3547 |
0.453 | 63.0 | 42588 | 1.3588 |
0.4508 | 64.0 | 43264 | 1.3694 |
0.4316 | 65.0 | 43940 | 1.3753 |
0.4386 | 66.0 | 44616 | 1.3804 |
0.4243 | 67.0 | 45292 | 1.3797 |
0.4188 | 68.0 | 45968 | 1.3833 |
0.4132 | 69.0 | 46644 | 1.3980 |
0.4244 | 70.0 | 47320 | 1.3960 |
0.3925 | 71.0 | 47996 | 1.4038 |
0.3919 | 72.0 | 48672 | 1.4228 |
0.3933 | 73.0 | 49348 | 1.4173 |
0.394 | 74.0 | 50024 | 1.4243 |
0.3916 | 75.0 | 50700 | 1.4224 |
0.3745 | 76.0 | 51376 | 1.4274 |
0.3708 | 77.0 | 52052 | 1.4296 |
0.3667 | 78.0 | 52728 | 1.4342 |
0.356 | 79.0 | 53404 | 1.4478 |
0.3546 | 80.0 | 54080 | 1.4431 |
0.353 | 81.0 | 54756 | 1.4546 |
0.3473 | 82.0 | 55432 | 1.4520 |
0.3442 | 83.0 | 56108 | 1.4526 |
0.3388 | 84.0 | 56784 | 1.4758 |
0.3044 | 85.0 | 57460 | 1.4715 |
0.3268 | 86.0 | 58136 | 1.4972 |
0.3185 | 87.0 | 58812 | 1.4889 |
0.3092 | 88.0 | 59488 | 1.4899 |
0.3044 | 89.0 | 60164 | 1.5039 |
0.3055 | 90.0 | 60840 | 1.4887 |
0.3014 | 91.0 | 61516 | 1.5114 |
0.2955 | 92.0 | 62192 | 1.5135 |
0.2912 | 93.0 | 62868 | 1.5246 |
0.2969 | 94.0 | 63544 | 1.5319 |
0.2787 | 95.0 | 64220 | 1.5254 |
0.288 | 96.0 | 64896 | 1.5278 |
0.2629 | 97.0 | 65572 | 1.5385 |
0.2717 | 98.0 | 66248 | 1.5580 |
0.2728 | 99.0 | 66924 | 1.5702 |
0.2616 | 100.0 | 67600 | 1.5533 |
0.2675 | 101.0 | 68276 | 1.5613 |
0.2477 | 102.0 | 68952 | 1.5610 |
0.2553 | 103.0 | 69628 | 1.5691 |
0.2533 | 104.0 | 70304 | 1.5667 |
0.2557 | 105.0 | 70980 | 1.6058 |
0.25 | 106.0 | 71656 | 1.5902 |
0.2386 | 107.0 | 72332 | 1.6042 |
0.2351 | 108.0 | 73008 | 1.6072 |
0.2407 | 109.0 | 73684 | 1.6102 |
0.2297 | 110.0 | 74360 | 1.6164 |
0.2363 | 111.0 | 75036 | 1.6082 |
0.2316 | 112.0 | 75712 | 1.6259 |
0.2313 | 113.0 | 76388 | 1.6233 |
0.2338 | 114.0 | 77064 | 1.6290 |
0.204 | 115.0 | 77740 | 1.6390 |
0.2353 | 116.0 | 78416 | 1.6311 |
0.2086 | 117.0 | 79092 | 1.6315 |
0.2052 | 118.0 | 79768 | 1.6337 |
0.2239 | 119.0 | 80444 | 1.6419 |
0.2056 | 120.0 | 81120 | 1.6438 |
0.2037 | 121.0 | 81796 | 1.6513 |
0.2059 | 122.0 | 82472 | 1.6664 |
0.1981 | 123.0 | 83148 | 1.6607 |
0.1918 | 124.0 | 83824 | 1.6807 |
0.1894 | 125.0 | 84500 | 1.6726 |
0.1897 | 126.0 | 85176 | 1.6886 |
0.1887 | 127.0 | 85852 | 1.6848 |
0.192 | 128.0 | 86528 | 1.6893 |
0.1961 | 129.0 | 87204 | 1.6995 |
0.1772 | 130.0 | 87880 | 1.6966 |
0.189 | 131.0 | 88556 | 1.7023 |
0.1759 | 132.0 | 89232 | 1.7025 |
0.1856 | 133.0 | 89908 | 1.7151 |
0.1792 | 134.0 | 90584 | 1.7162 |
0.1767 | 135.0 | 91260 | 1.7129 |
0.1606 | 136.0 | 91936 | 1.7348 |
0.1788 | 137.0 | 92612 | 1.7216 |
0.1608 | 138.0 | 93288 | 1.7401 |
0.1769 | 139.0 | 93964 | 1.7486 |
0.162 | 140.0 | 94640 | 1.7506 |
0.1572 | 141.0 | 95316 | 1.7338 |
0.1585 | 142.0 | 95992 | 1.7441 |
0.1638 | 143.0 | 96668 | 1.7660 |
0.172 | 144.0 | 97344 | 1.7548 |
0.1638 | 145.0 | 98020 | 1.7673 |
0.1515 | 146.0 | 98696 | 1.7623 |
0.1713 | 147.0 | 99372 | 1.7516 |
0.1434 | 148.0 | 100048 | 1.7782 |
0.1578 | 149.0 | 100724 | 1.7881 |
0.1473 | 150.0 | 101400 | 1.7795 |
0.1491 | 151.0 | 102076 | 1.7861 |
0.1573 | 152.0 | 102752 | 1.7901 |
0.1472 | 153.0 | 103428 | 1.7969 |
0.1469 | 154.0 | 104104 | 1.8035 |
0.1539 | 155.0 | 104780 | 1.7849 |
0.1473 | 156.0 | 105456 | 1.7996 |
0.1414 | 157.0 | 106132 | 1.7976 |
0.1592 | 158.0 | 106808 | 1.8022 |
0.1284 | 159.0 | 107484 | 1.7968 |
0.1373 | 160.0 | 108160 | 1.8150 |
0.1446 | 161.0 | 108836 | 1.8154 |
0.1382 | 162.0 | 109512 | 1.8139 |
0.147 | 163.0 | 110188 | 1.8228 |
0.1385 | 164.0 | 110864 | 1.8222 |
0.1288 | 165.0 | 111540 | 1.8229 |
0.1368 | 166.0 | 112216 | 1.8266 |
0.1343 | 167.0 | 112892 | 1.8313 |
0.1346 | 168.0 | 113568 | 1.8199 |
0.1389 | 169.0 | 114244 | 1.8277 |
0.1432 | 170.0 | 114920 | 1.8330 |
0.1268 | 171.0 | 115596 | 1.8358 |
0.1309 | 172.0 | 116272 | 1.8416 |
0.1344 | 173.0 | 116948 | 1.8289 |
0.1338 | 174.0 | 117624 | 1.8418 |
0.1315 | 175.0 | 118300 | 1.8325 |
0.1245 | 176.0 | 118976 | 1.8351 |
0.1305 | 177.0 | 119652 | 1.8503 |
0.1254 | 178.0 | 120328 | 1.8431 |
0.1223 | 179.0 | 121004 | 1.8506 |
0.1234 | 180.0 | 121680 | 1.8480 |
0.1223 | 181.0 | 122356 | 1.8435 |
0.1304 | 182.0 | 123032 | 1.8530 |
0.121 | 183.0 | 123708 | 1.8480 |
0.1284 | 184.0 | 124384 | 1.8550 |
0.1339 | 185.0 | 125060 | 1.8578 |
0.1353 | 186.0 | 125736 | 1.8476 |
0.1219 | 187.0 | 126412 | 1.8550 |
0.117 | 188.0 | 127088 | 1.8606 |
0.1269 | 189.0 | 127764 | 1.8588 |
0.1118 | 190.0 | 128440 | 1.8564 |
0.1226 | 191.0 | 129116 | 1.8682 |
0.1284 | 192.0 | 129792 | 1.8582 |
0.1125 | 193.0 | 130468 | 1.8603 |
0.1227 | 194.0 | 131144 | 1.8660 |
0.1373 | 195.0 | 131820 | 1.8660 |
0.1122 | 196.0 | 132496 | 1.8647 |
0.1282 | 197.0 | 133172 | 1.8632 |
0.1199 | 198.0 | 133848 | 1.8625 |
0.1281 | 199.0 | 134524 | 1.8640 |
0.1274 | 200.0 | 135200 | 1.8645 |
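
Validation loss reaches its minimum of 1.2478 at epoch 26 and rises steadily afterwards while training loss keeps falling, so the final eval loss of 1.8645 reflects the last epoch rather than the best checkpoint. A hedged sketch of how early stopping and best-checkpoint reloading could be wired in with transformers (the patience value is illustrative):

```python
# Hedged sketch: per-epoch checkpointing with early stopping so training halts
# once eval loss stops improving; the patience value is illustrative.
from transformers import EarlyStoppingCallback, Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-base-finetuned-justification-v06",
    num_train_epochs=200,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,       # reload the lowest-eval-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
# Pass to the trainer, e.g.:
# Seq2SeqTrainer(..., callbacks=[EarlyStoppingCallback(early_stopping_patience=5)])
```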
### Framework versions
- Transformers 4.38.2
- Pytorch 2.2.2+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2