satyanshu404 committed on
Commit f88ed3a
1 Parent(s): 3fdab49

End of training

Files changed (3)
  1. README.md +256 -0
  2. generation_config.json +7 -0
  3. model.safetensors +1 -1
README.md ADDED
@@ -0,0 +1,256 @@
+ ---
+ license: apache-2.0
+ base_model: google/long-t5-local-base
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: long-t5-local-base-finetuned-justification-v06
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # long-t5-local-base-finetuned-justification-v06
+
+ This model is a fine-tuned version of [google/long-t5-local-base](https://huggingface.co/google/long-t5-local-base) on an unspecified dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.8645
+
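+ As a quick usage reference, here is a minimal sketch of loading this checkpoint with the `transformers` library; the repo id `satyanshu404/long-t5-local-base-finetuned-justification-v06` is an assumption inferred from the committer and model name, and the input text is a placeholder:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+ # Assumed repo id; adjust if the checkpoint lives under a different name.
+ repo_id = "satyanshu404/long-t5-local-base-finetuned-justification-v06"
+ tokenizer = AutoTokenizer.from_pretrained(repo_id)
+ model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)
+
+ # LongT5 accepts long inputs; truncation guards against exceeding the limit.
+ inputs = tokenizer("Text to generate a justification for...", return_tensors="pt", truncation=True)
+ outputs = model.generate(**inputs, max_new_tokens=128)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```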
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training (a sketch of equivalent `Seq2SeqTrainingArguments` follows the list):
+ - learning_rate: 2e-05
+ - train_batch_size: 1
+ - eval_batch_size: 1
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 200
+
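+ A minimal sketch of `Seq2SeqTrainingArguments` matching the settings above; the output path and the epoch-level evaluation strategy are assumptions (the latter inferred from the per-epoch results table), not stated on this card:
+
+ ```python
+ from transformers import Seq2SeqTrainingArguments
+
+ training_args = Seq2SeqTrainingArguments(
+     output_dir="long-t5-local-base-finetuned-justification-v06",  # assumed
+     learning_rate=2e-5,
+     per_device_train_batch_size=1,
+     per_device_eval_batch_size=1,
+     seed=42,
+     adam_beta1=0.9,               # Adam betas=(0.9, 0.999)
+     adam_beta2=0.999,
+     adam_epsilon=1e-8,
+     lr_scheduler_type="linear",
+     num_train_epochs=200,
+     evaluation_strategy="epoch",  # inferred: one validation row per epoch below
+ )
+ ```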
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:------:|:---------------:|
+ | 5.625 | 1.0 | 676 | 2.2487 |
+ | 1.8303 | 2.0 | 1352 | 1.9383 |
+ | 1.4604 | 3.0 | 2028 | 1.7984 |
+ | 1.4095 | 4.0 | 2704 | 1.6848 |
+ | 1.344 | 5.0 | 3380 | 1.6151 |
+ | 1.2566 | 6.0 | 4056 | 1.5658 |
+ | 1.2083 | 7.0 | 4732 | 1.5206 |
+ | 1.1799 | 8.0 | 5408 | 1.4763 |
+ | 1.1419 | 9.0 | 6084 | 1.4418 |
+ | 1.0928 | 10.0 | 6760 | 1.4131 |
+ | 1.134 | 11.0 | 7436 | 1.3941 |
+ | 1.0297 | 12.0 | 8112 | 1.3594 |
+ | 1.0153 | 13.0 | 8788 | 1.3456 |
+ | 1.0246 | 14.0 | 9464 | 1.3439 |
+ | 0.9392 | 15.0 | 10140 | 1.3199 |
+ | 0.9589 | 16.0 | 10816 | 1.3139 |
+ | 0.9286 | 17.0 | 11492 | 1.3046 |
+ | 0.8812 | 18.0 | 12168 | 1.2886 |
+ | 0.9437 | 19.0 | 12844 | 1.2862 |
+ | 0.8756 | 20.0 | 13520 | 1.2817 |
+ | 0.8738 | 21.0 | 14196 | 1.2762 |
+ | 0.8467 | 22.0 | 14872 | 1.2668 |
+ | 0.8306 | 23.0 | 15548 | 1.2623 |
+ | 0.8471 | 24.0 | 16224 | 1.2637 |
+ | 0.8003 | 25.0 | 16900 | 1.2530 |
+ | 0.8201 | 26.0 | 17576 | 1.2478 |
+ | 0.7935 | 27.0 | 18252 | 1.2573 |
+ | 0.7493 | 28.0 | 18928 | 1.2488 |
+ | 0.772 | 29.0 | 19604 | 1.2480 |
+ | 0.7537 | 30.0 | 20280 | 1.2558 |
+ | 0.7466 | 31.0 | 20956 | 1.2511 |
+ | 0.7481 | 32.0 | 21632 | 1.2561 |
+ | 0.7016 | 33.0 | 22308 | 1.2619 |
+ | 0.7067 | 34.0 | 22984 | 1.2557 |
+ | 0.7206 | 35.0 | 23660 | 1.2493 |
+ | 0.6842 | 36.0 | 24336 | 1.2528 |
+ | 0.6835 | 37.0 | 25012 | 1.2626 |
+ | 0.6799 | 38.0 | 25688 | 1.2605 |
+ | 0.6293 | 39.0 | 26364 | 1.2746 |
+ | 0.6269 | 40.0 | 27040 | 1.2725 |
+ | 0.6341 | 41.0 | 27716 | 1.2671 |
+ | 0.6193 | 42.0 | 28392 | 1.2739 |
+ | 0.6434 | 43.0 | 29068 | 1.2784 |
+ | 0.6 | 44.0 | 29744 | 1.2846 |
+ | 0.5844 | 45.0 | 30420 | 1.3010 |
+ | 0.5801 | 46.0 | 31096 | 1.2964 |
+ | 0.5803 | 47.0 | 31772 | 1.2938 |
+ | 0.5755 | 48.0 | 32448 | 1.2986 |
+ | 0.5703 | 49.0 | 33124 | 1.3067 |
+ | 0.566 | 50.0 | 33800 | 1.2990 |
+ | 0.5356 | 51.0 | 34476 | 1.3021 |
+ | 0.5331 | 52.0 | 35152 | 1.2996 |
+ | 0.5657 | 53.0 | 35828 | 1.3225 |
+ | 0.5195 | 54.0 | 36504 | 1.3237 |
+ | 0.5199 | 55.0 | 37180 | 1.3253 |
+ | 0.5225 | 56.0 | 37856 | 1.3275 |
+ | 0.4995 | 57.0 | 38532 | 1.3347 |
+ | 0.4991 | 58.0 | 39208 | 1.3356 |
+ | 0.4848 | 59.0 | 39884 | 1.3534 |
+ | 0.4731 | 60.0 | 40560 | 1.3557 |
+ | 0.4526 | 61.0 | 41236 | 1.3445 |
+ | 0.47 | 62.0 | 41912 | 1.3547 |
+ | 0.453 | 63.0 | 42588 | 1.3588 |
+ | 0.4508 | 64.0 | 43264 | 1.3694 |
+ | 0.4316 | 65.0 | 43940 | 1.3753 |
+ | 0.4386 | 66.0 | 44616 | 1.3804 |
+ | 0.4243 | 67.0 | 45292 | 1.3797 |
+ | 0.4188 | 68.0 | 45968 | 1.3833 |
+ | 0.4132 | 69.0 | 46644 | 1.3980 |
+ | 0.4244 | 70.0 | 47320 | 1.3960 |
+ | 0.3925 | 71.0 | 47996 | 1.4038 |
+ | 0.3919 | 72.0 | 48672 | 1.4228 |
+ | 0.3933 | 73.0 | 49348 | 1.4173 |
+ | 0.394 | 74.0 | 50024 | 1.4243 |
+ | 0.3916 | 75.0 | 50700 | 1.4224 |
+ | 0.3745 | 76.0 | 51376 | 1.4274 |
+ | 0.3708 | 77.0 | 52052 | 1.4296 |
+ | 0.3667 | 78.0 | 52728 | 1.4342 |
+ | 0.356 | 79.0 | 53404 | 1.4478 |
+ | 0.3546 | 80.0 | 54080 | 1.4431 |
+ | 0.353 | 81.0 | 54756 | 1.4546 |
+ | 0.3473 | 82.0 | 55432 | 1.4520 |
+ | 0.3442 | 83.0 | 56108 | 1.4526 |
+ | 0.3388 | 84.0 | 56784 | 1.4758 |
+ | 0.3044 | 85.0 | 57460 | 1.4715 |
+ | 0.3268 | 86.0 | 58136 | 1.4972 |
+ | 0.3185 | 87.0 | 58812 | 1.4889 |
+ | 0.3092 | 88.0 | 59488 | 1.4899 |
+ | 0.3044 | 89.0 | 60164 | 1.5039 |
+ | 0.3055 | 90.0 | 60840 | 1.4887 |
+ | 0.3014 | 91.0 | 61516 | 1.5114 |
+ | 0.2955 | 92.0 | 62192 | 1.5135 |
+ | 0.2912 | 93.0 | 62868 | 1.5246 |
+ | 0.2969 | 94.0 | 63544 | 1.5319 |
+ | 0.2787 | 95.0 | 64220 | 1.5254 |
+ | 0.288 | 96.0 | 64896 | 1.5278 |
+ | 0.2629 | 97.0 | 65572 | 1.5385 |
+ | 0.2717 | 98.0 | 66248 | 1.5580 |
+ | 0.2728 | 99.0 | 66924 | 1.5702 |
+ | 0.2616 | 100.0 | 67600 | 1.5533 |
+ | 0.2675 | 101.0 | 68276 | 1.5613 |
+ | 0.2477 | 102.0 | 68952 | 1.5610 |
+ | 0.2553 | 103.0 | 69628 | 1.5691 |
+ | 0.2533 | 104.0 | 70304 | 1.5667 |
+ | 0.2557 | 105.0 | 70980 | 1.6058 |
+ | 0.25 | 106.0 | 71656 | 1.5902 |
+ | 0.2386 | 107.0 | 72332 | 1.6042 |
+ | 0.2351 | 108.0 | 73008 | 1.6072 |
+ | 0.2407 | 109.0 | 73684 | 1.6102 |
+ | 0.2297 | 110.0 | 74360 | 1.6164 |
+ | 0.2363 | 111.0 | 75036 | 1.6082 |
+ | 0.2316 | 112.0 | 75712 | 1.6259 |
+ | 0.2313 | 113.0 | 76388 | 1.6233 |
+ | 0.2338 | 114.0 | 77064 | 1.6290 |
+ | 0.204 | 115.0 | 77740 | 1.6390 |
+ | 0.2353 | 116.0 | 78416 | 1.6311 |
+ | 0.2086 | 117.0 | 79092 | 1.6315 |
+ | 0.2052 | 118.0 | 79768 | 1.6337 |
+ | 0.2239 | 119.0 | 80444 | 1.6419 |
+ | 0.2056 | 120.0 | 81120 | 1.6438 |
+ | 0.2037 | 121.0 | 81796 | 1.6513 |
+ | 0.2059 | 122.0 | 82472 | 1.6664 |
+ | 0.1981 | 123.0 | 83148 | 1.6607 |
+ | 0.1918 | 124.0 | 83824 | 1.6807 |
+ | 0.1894 | 125.0 | 84500 | 1.6726 |
+ | 0.1897 | 126.0 | 85176 | 1.6886 |
+ | 0.1887 | 127.0 | 85852 | 1.6848 |
+ | 0.192 | 128.0 | 86528 | 1.6893 |
+ | 0.1961 | 129.0 | 87204 | 1.6995 |
+ | 0.1772 | 130.0 | 87880 | 1.6966 |
+ | 0.189 | 131.0 | 88556 | 1.7023 |
+ | 0.1759 | 132.0 | 89232 | 1.7025 |
+ | 0.1856 | 133.0 | 89908 | 1.7151 |
+ | 0.1792 | 134.0 | 90584 | 1.7162 |
+ | 0.1767 | 135.0 | 91260 | 1.7129 |
+ | 0.1606 | 136.0 | 91936 | 1.7348 |
+ | 0.1788 | 137.0 | 92612 | 1.7216 |
+ | 0.1608 | 138.0 | 93288 | 1.7401 |
+ | 0.1769 | 139.0 | 93964 | 1.7486 |
+ | 0.162 | 140.0 | 94640 | 1.7506 |
+ | 0.1572 | 141.0 | 95316 | 1.7338 |
+ | 0.1585 | 142.0 | 95992 | 1.7441 |
+ | 0.1638 | 143.0 | 96668 | 1.7660 |
+ | 0.172 | 144.0 | 97344 | 1.7548 |
+ | 0.1638 | 145.0 | 98020 | 1.7673 |
+ | 0.1515 | 146.0 | 98696 | 1.7623 |
+ | 0.1713 | 147.0 | 99372 | 1.7516 |
+ | 0.1434 | 148.0 | 100048 | 1.7782 |
+ | 0.1578 | 149.0 | 100724 | 1.7881 |
+ | 0.1473 | 150.0 | 101400 | 1.7795 |
+ | 0.1491 | 151.0 | 102076 | 1.7861 |
+ | 0.1573 | 152.0 | 102752 | 1.7901 |
+ | 0.1472 | 153.0 | 103428 | 1.7969 |
+ | 0.1469 | 154.0 | 104104 | 1.8035 |
+ | 0.1539 | 155.0 | 104780 | 1.7849 |
+ | 0.1473 | 156.0 | 105456 | 1.7996 |
+ | 0.1414 | 157.0 | 106132 | 1.7976 |
+ | 0.1592 | 158.0 | 106808 | 1.8022 |
+ | 0.1284 | 159.0 | 107484 | 1.7968 |
+ | 0.1373 | 160.0 | 108160 | 1.8150 |
+ | 0.1446 | 161.0 | 108836 | 1.8154 |
+ | 0.1382 | 162.0 | 109512 | 1.8139 |
+ | 0.147 | 163.0 | 110188 | 1.8228 |
+ | 0.1385 | 164.0 | 110864 | 1.8222 |
+ | 0.1288 | 165.0 | 111540 | 1.8229 |
+ | 0.1368 | 166.0 | 112216 | 1.8266 |
+ | 0.1343 | 167.0 | 112892 | 1.8313 |
+ | 0.1346 | 168.0 | 113568 | 1.8199 |
+ | 0.1389 | 169.0 | 114244 | 1.8277 |
+ | 0.1432 | 170.0 | 114920 | 1.8330 |
+ | 0.1268 | 171.0 | 115596 | 1.8358 |
+ | 0.1309 | 172.0 | 116272 | 1.8416 |
+ | 0.1344 | 173.0 | 116948 | 1.8289 |
+ | 0.1338 | 174.0 | 117624 | 1.8418 |
+ | 0.1315 | 175.0 | 118300 | 1.8325 |
+ | 0.1245 | 176.0 | 118976 | 1.8351 |
+ | 0.1305 | 177.0 | 119652 | 1.8503 |
+ | 0.1254 | 178.0 | 120328 | 1.8431 |
+ | 0.1223 | 179.0 | 121004 | 1.8506 |
+ | 0.1234 | 180.0 | 121680 | 1.8480 |
+ | 0.1223 | 181.0 | 122356 | 1.8435 |
+ | 0.1304 | 182.0 | 123032 | 1.8530 |
+ | 0.121 | 183.0 | 123708 | 1.8480 |
+ | 0.1284 | 184.0 | 124384 | 1.8550 |
+ | 0.1339 | 185.0 | 125060 | 1.8578 |
+ | 0.1353 | 186.0 | 125736 | 1.8476 |
+ | 0.1219 | 187.0 | 126412 | 1.8550 |
+ | 0.117 | 188.0 | 127088 | 1.8606 |
+ | 0.1269 | 189.0 | 127764 | 1.8588 |
+ | 0.1118 | 190.0 | 128440 | 1.8564 |
+ | 0.1226 | 191.0 | 129116 | 1.8682 |
+ | 0.1284 | 192.0 | 129792 | 1.8582 |
+ | 0.1125 | 193.0 | 130468 | 1.8603 |
+ | 0.1227 | 194.0 | 131144 | 1.8660 |
+ | 0.1373 | 195.0 | 131820 | 1.8660 |
+ | 0.1122 | 196.0 | 132496 | 1.8647 |
+ | 0.1282 | 197.0 | 133172 | 1.8632 |
+ | 0.1199 | 198.0 | 133848 | 1.8625 |
+ | 0.1281 | 199.0 | 134524 | 1.8640 |
+ | 0.1274 | 200.0 | 135200 | 1.8645 |
+
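+ Note that the validation loss bottoms out around 1.25 near epochs 25–30 and then climbs steadily back to 1.86 while the training loss keeps falling, a typical overfitting pattern; an earlier checkpoint from that region may generalize better than the final one.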
+
+ ### Framework versions
+
+ - Transformers 4.38.2
+ - Pytorch 2.2.2+cu121
+ - Datasets 2.18.0
+ - Tokenizers 0.15.2
generation_config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "_from_model_config": true,
+   "decoder_start_token_id": 0,
+   "eos_token_id": 1,
+   "pad_token_id": 0,
+   "transformers_version": "4.38.2"
+ }
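When no overrides are passed, `model.generate(...)` picks up these defaults automatically. A minimal sketch of inspecting them with `GenerationConfig` (the repo id is the same assumption as above):

```python
from transformers import GenerationConfig

# Assumed repo id; adjust if the checkpoint lives under a different name.
gen_config = GenerationConfig.from_pretrained(
    "satyanshu404/long-t5-local-base-finetuned-justification-v06"
)
print(gen_config.decoder_start_token_id, gen_config.eos_token_id, gen_config.pad_token_id)
```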
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d1f6b9bf40855c93dc036b5f097a49cebe225961772728513a08d2feb8e34d96
+ oid sha256:e0ff30f97d2a579460af9f5c9ff4438df9646c1063ed603368534bf988a32488
  size 990345312