satyanshu404
committed on
Commit • f88ed3a
1 Parent(s): 3fdab49

End of training

Browse files:
- README.md +256 -0
- generation_config.json +7 -0
- model.safetensors +1 -1
README.md
ADDED
@@ -0,0 +1,256 @@
---
license: apache-2.0
base_model: google/long-t5-local-base
tags:
- generated_from_trainer
model-index:
- name: long-t5-local-base-finetuned-justification-v06
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# long-t5-local-base-finetuned-justification-v06

This model is a fine-tuned version of [google/long-t5-local-base](https://huggingface.co/google/long-t5-local-base) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 1.8645

## Model description

More information needed

## Intended uses & limitations

More information needed
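
The intended downstream use is not documented here. As a minimal, hedged sketch of how the checkpoint could be loaded for inference (the repo id below is an assumption based on the model name, and the 4096-token truncation is illustrative rather than prescribed by this card):

```python
# Minimal inference sketch (assumed repo id; replace with a local path if needed).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

repo_id = "satyanshu404/long-t5-local-base-finetuned-justification-v06"  # assumption
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

document = "Long input text to be justified goes here."  # placeholder input
inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=4096)
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```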

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 200
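
As a rough illustration only, the hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as follows. The dataset, preprocessing, and exact Trainer wiring are not documented on this card, so the commented-out pieces are placeholders; the per-epoch evaluation/save strategy is an assumption inferred from the per-epoch results table below.

```python
# Hedged sketch of the training configuration (Transformers 4.38-era API).
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-base-finetuned-justification-v06",
    learning_rate=2e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    # The Adam betas/epsilon listed above are the Trainer defaults (0.9, 0.999, 1e-08).
    evaluation_strategy="epoch",  # assumption: matches the per-epoch table below
    save_strategy="epoch",        # assumption
)

# Placeholder wiring (model, tokenizer, and datasets are not documented here):
# trainer = Seq2SeqTrainer(model=model, args=args, tokenizer=tokenizer,
#                          train_dataset=train_ds, eval_dataset=eval_ds)
# trainer.train()
```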

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:------:|:---------------:|
| 5.625 | 1.0 | 676 | 2.2487 |
| 1.8303 | 2.0 | 1352 | 1.9383 |
| 1.4604 | 3.0 | 2028 | 1.7984 |
| 1.4095 | 4.0 | 2704 | 1.6848 |
| 1.344 | 5.0 | 3380 | 1.6151 |
| 1.2566 | 6.0 | 4056 | 1.5658 |
| 1.2083 | 7.0 | 4732 | 1.5206 |
| 1.1799 | 8.0 | 5408 | 1.4763 |
| 1.1419 | 9.0 | 6084 | 1.4418 |
| 1.0928 | 10.0 | 6760 | 1.4131 |
| 1.134 | 11.0 | 7436 | 1.3941 |
| 1.0297 | 12.0 | 8112 | 1.3594 |
| 1.0153 | 13.0 | 8788 | 1.3456 |
| 1.0246 | 14.0 | 9464 | 1.3439 |
| 0.9392 | 15.0 | 10140 | 1.3199 |
| 0.9589 | 16.0 | 10816 | 1.3139 |
| 0.9286 | 17.0 | 11492 | 1.3046 |
| 0.8812 | 18.0 | 12168 | 1.2886 |
| 0.9437 | 19.0 | 12844 | 1.2862 |
| 0.8756 | 20.0 | 13520 | 1.2817 |
| 0.8738 | 21.0 | 14196 | 1.2762 |
| 0.8467 | 22.0 | 14872 | 1.2668 |
| 0.8306 | 23.0 | 15548 | 1.2623 |
| 0.8471 | 24.0 | 16224 | 1.2637 |
| 0.8003 | 25.0 | 16900 | 1.2530 |
| 0.8201 | 26.0 | 17576 | 1.2478 |
| 0.7935 | 27.0 | 18252 | 1.2573 |
| 0.7493 | 28.0 | 18928 | 1.2488 |
| 0.772 | 29.0 | 19604 | 1.2480 |
| 0.7537 | 30.0 | 20280 | 1.2558 |
| 0.7466 | 31.0 | 20956 | 1.2511 |
| 0.7481 | 32.0 | 21632 | 1.2561 |
| 0.7016 | 33.0 | 22308 | 1.2619 |
| 0.7067 | 34.0 | 22984 | 1.2557 |
| 0.7206 | 35.0 | 23660 | 1.2493 |
| 0.6842 | 36.0 | 24336 | 1.2528 |
| 0.6835 | 37.0 | 25012 | 1.2626 |
| 0.6799 | 38.0 | 25688 | 1.2605 |
| 0.6293 | 39.0 | 26364 | 1.2746 |
| 0.6269 | 40.0 | 27040 | 1.2725 |
| 0.6341 | 41.0 | 27716 | 1.2671 |
| 0.6193 | 42.0 | 28392 | 1.2739 |
| 0.6434 | 43.0 | 29068 | 1.2784 |
| 0.6 | 44.0 | 29744 | 1.2846 |
| 0.5844 | 45.0 | 30420 | 1.3010 |
| 0.5801 | 46.0 | 31096 | 1.2964 |
| 0.5803 | 47.0 | 31772 | 1.2938 |
| 0.5755 | 48.0 | 32448 | 1.2986 |
| 0.5703 | 49.0 | 33124 | 1.3067 |
| 0.566 | 50.0 | 33800 | 1.2990 |
| 0.5356 | 51.0 | 34476 | 1.3021 |
| 0.5331 | 52.0 | 35152 | 1.2996 |
| 0.5657 | 53.0 | 35828 | 1.3225 |
| 0.5195 | 54.0 | 36504 | 1.3237 |
| 0.5199 | 55.0 | 37180 | 1.3253 |
| 0.5225 | 56.0 | 37856 | 1.3275 |
| 0.4995 | 57.0 | 38532 | 1.3347 |
| 0.4991 | 58.0 | 39208 | 1.3356 |
| 0.4848 | 59.0 | 39884 | 1.3534 |
| 0.4731 | 60.0 | 40560 | 1.3557 |
| 0.4526 | 61.0 | 41236 | 1.3445 |
| 0.47 | 62.0 | 41912 | 1.3547 |
| 0.453 | 63.0 | 42588 | 1.3588 |
| 0.4508 | 64.0 | 43264 | 1.3694 |
| 0.4316 | 65.0 | 43940 | 1.3753 |
| 0.4386 | 66.0 | 44616 | 1.3804 |
| 0.4243 | 67.0 | 45292 | 1.3797 |
| 0.4188 | 68.0 | 45968 | 1.3833 |
| 0.4132 | 69.0 | 46644 | 1.3980 |
| 0.4244 | 70.0 | 47320 | 1.3960 |
| 0.3925 | 71.0 | 47996 | 1.4038 |
| 0.3919 | 72.0 | 48672 | 1.4228 |
| 0.3933 | 73.0 | 49348 | 1.4173 |
| 0.394 | 74.0 | 50024 | 1.4243 |
| 0.3916 | 75.0 | 50700 | 1.4224 |
| 0.3745 | 76.0 | 51376 | 1.4274 |
| 0.3708 | 77.0 | 52052 | 1.4296 |
| 0.3667 | 78.0 | 52728 | 1.4342 |
| 0.356 | 79.0 | 53404 | 1.4478 |
| 0.3546 | 80.0 | 54080 | 1.4431 |
| 0.353 | 81.0 | 54756 | 1.4546 |
| 0.3473 | 82.0 | 55432 | 1.4520 |
| 0.3442 | 83.0 | 56108 | 1.4526 |
| 0.3388 | 84.0 | 56784 | 1.4758 |
| 0.3044 | 85.0 | 57460 | 1.4715 |
| 0.3268 | 86.0 | 58136 | 1.4972 |
| 0.3185 | 87.0 | 58812 | 1.4889 |
| 0.3092 | 88.0 | 59488 | 1.4899 |
| 0.3044 | 89.0 | 60164 | 1.5039 |
| 0.3055 | 90.0 | 60840 | 1.4887 |
| 0.3014 | 91.0 | 61516 | 1.5114 |
| 0.2955 | 92.0 | 62192 | 1.5135 |
| 0.2912 | 93.0 | 62868 | 1.5246 |
| 0.2969 | 94.0 | 63544 | 1.5319 |
| 0.2787 | 95.0 | 64220 | 1.5254 |
| 0.288 | 96.0 | 64896 | 1.5278 |
| 0.2629 | 97.0 | 65572 | 1.5385 |
| 0.2717 | 98.0 | 66248 | 1.5580 |
| 0.2728 | 99.0 | 66924 | 1.5702 |
| 0.2616 | 100.0 | 67600 | 1.5533 |
| 0.2675 | 101.0 | 68276 | 1.5613 |
| 0.2477 | 102.0 | 68952 | 1.5610 |
| 0.2553 | 103.0 | 69628 | 1.5691 |
| 0.2533 | 104.0 | 70304 | 1.5667 |
| 0.2557 | 105.0 | 70980 | 1.6058 |
| 0.25 | 106.0 | 71656 | 1.5902 |
| 0.2386 | 107.0 | 72332 | 1.6042 |
| 0.2351 | 108.0 | 73008 | 1.6072 |
| 0.2407 | 109.0 | 73684 | 1.6102 |
| 0.2297 | 110.0 | 74360 | 1.6164 |
| 0.2363 | 111.0 | 75036 | 1.6082 |
| 0.2316 | 112.0 | 75712 | 1.6259 |
| 0.2313 | 113.0 | 76388 | 1.6233 |
| 0.2338 | 114.0 | 77064 | 1.6290 |
| 0.204 | 115.0 | 77740 | 1.6390 |
| 0.2353 | 116.0 | 78416 | 1.6311 |
| 0.2086 | 117.0 | 79092 | 1.6315 |
| 0.2052 | 118.0 | 79768 | 1.6337 |
| 0.2239 | 119.0 | 80444 | 1.6419 |
| 0.2056 | 120.0 | 81120 | 1.6438 |
| 0.2037 | 121.0 | 81796 | 1.6513 |
| 0.2059 | 122.0 | 82472 | 1.6664 |
| 0.1981 | 123.0 | 83148 | 1.6607 |
| 0.1918 | 124.0 | 83824 | 1.6807 |
| 0.1894 | 125.0 | 84500 | 1.6726 |
| 0.1897 | 126.0 | 85176 | 1.6886 |
| 0.1887 | 127.0 | 85852 | 1.6848 |
| 0.192 | 128.0 | 86528 | 1.6893 |
| 0.1961 | 129.0 | 87204 | 1.6995 |
| 0.1772 | 130.0 | 87880 | 1.6966 |
| 0.189 | 131.0 | 88556 | 1.7023 |
| 0.1759 | 132.0 | 89232 | 1.7025 |
| 0.1856 | 133.0 | 89908 | 1.7151 |
| 0.1792 | 134.0 | 90584 | 1.7162 |
| 0.1767 | 135.0 | 91260 | 1.7129 |
| 0.1606 | 136.0 | 91936 | 1.7348 |
| 0.1788 | 137.0 | 92612 | 1.7216 |
| 0.1608 | 138.0 | 93288 | 1.7401 |
| 0.1769 | 139.0 | 93964 | 1.7486 |
| 0.162 | 140.0 | 94640 | 1.7506 |
| 0.1572 | 141.0 | 95316 | 1.7338 |
| 0.1585 | 142.0 | 95992 | 1.7441 |
| 0.1638 | 143.0 | 96668 | 1.7660 |
| 0.172 | 144.0 | 97344 | 1.7548 |
| 0.1638 | 145.0 | 98020 | 1.7673 |
| 0.1515 | 146.0 | 98696 | 1.7623 |
| 0.1713 | 147.0 | 99372 | 1.7516 |
| 0.1434 | 148.0 | 100048 | 1.7782 |
| 0.1578 | 149.0 | 100724 | 1.7881 |
| 0.1473 | 150.0 | 101400 | 1.7795 |
| 0.1491 | 151.0 | 102076 | 1.7861 |
| 0.1573 | 152.0 | 102752 | 1.7901 |
| 0.1472 | 153.0 | 103428 | 1.7969 |
| 0.1469 | 154.0 | 104104 | 1.8035 |
| 0.1539 | 155.0 | 104780 | 1.7849 |
| 0.1473 | 156.0 | 105456 | 1.7996 |
| 0.1414 | 157.0 | 106132 | 1.7976 |
| 0.1592 | 158.0 | 106808 | 1.8022 |
| 0.1284 | 159.0 | 107484 | 1.7968 |
| 0.1373 | 160.0 | 108160 | 1.8150 |
| 0.1446 | 161.0 | 108836 | 1.8154 |
| 0.1382 | 162.0 | 109512 | 1.8139 |
| 0.147 | 163.0 | 110188 | 1.8228 |
| 0.1385 | 164.0 | 110864 | 1.8222 |
| 0.1288 | 165.0 | 111540 | 1.8229 |
| 0.1368 | 166.0 | 112216 | 1.8266 |
| 0.1343 | 167.0 | 112892 | 1.8313 |
| 0.1346 | 168.0 | 113568 | 1.8199 |
| 0.1389 | 169.0 | 114244 | 1.8277 |
| 0.1432 | 170.0 | 114920 | 1.8330 |
| 0.1268 | 171.0 | 115596 | 1.8358 |
| 0.1309 | 172.0 | 116272 | 1.8416 |
| 0.1344 | 173.0 | 116948 | 1.8289 |
| 0.1338 | 174.0 | 117624 | 1.8418 |
| 0.1315 | 175.0 | 118300 | 1.8325 |
| 0.1245 | 176.0 | 118976 | 1.8351 |
| 0.1305 | 177.0 | 119652 | 1.8503 |
| 0.1254 | 178.0 | 120328 | 1.8431 |
| 0.1223 | 179.0 | 121004 | 1.8506 |
| 0.1234 | 180.0 | 121680 | 1.8480 |
| 0.1223 | 181.0 | 122356 | 1.8435 |
| 0.1304 | 182.0 | 123032 | 1.8530 |
| 0.121 | 183.0 | 123708 | 1.8480 |
| 0.1284 | 184.0 | 124384 | 1.8550 |
| 0.1339 | 185.0 | 125060 | 1.8578 |
| 0.1353 | 186.0 | 125736 | 1.8476 |
| 0.1219 | 187.0 | 126412 | 1.8550 |
| 0.117 | 188.0 | 127088 | 1.8606 |
| 0.1269 | 189.0 | 127764 | 1.8588 |
| 0.1118 | 190.0 | 128440 | 1.8564 |
| 0.1226 | 191.0 | 129116 | 1.8682 |
| 0.1284 | 192.0 | 129792 | 1.8582 |
| 0.1125 | 193.0 | 130468 | 1.8603 |
| 0.1227 | 194.0 | 131144 | 1.8660 |
| 0.1373 | 195.0 | 131820 | 1.8660 |
| 0.1122 | 196.0 | 132496 | 1.8647 |
| 0.1282 | 197.0 | 133172 | 1.8632 |
| 0.1199 | 198.0 | 133848 | 1.8625 |
| 0.1281 | 199.0 | 134524 | 1.8640 |
| 0.1274 | 200.0 | 135200 | 1.8645 |

### Framework versions

- Transformers 4.38.2
- PyTorch 2.2.2+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
generation_config.json
ADDED
@@ -0,0 +1,7 @@
{
  "_from_model_config": true,
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.38.2"
}
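
This file records the default generation settings saved with the checkpoint; `generate()` picks them up automatically. A small hedged sketch for inspecting or overriding them (the repo id is again an assumption based on the model name):

```python
# Load the saved generation defaults and override them per call if desired.
from transformers import GenerationConfig

gen_config = GenerationConfig.from_pretrained(
    "satyanshu404/long-t5-local-base-finetuned-justification-v06"  # assumption
)
print(gen_config.decoder_start_token_id, gen_config.eos_token_id, gen_config.pad_token_id)

# Per-call arguments take precedence over the saved defaults, e.g.:
# model.generate(**inputs, generation_config=gen_config, num_beams=4, max_new_tokens=256)
```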
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:e0ff30f97d2a579460af9f5c9ff4438df9646c1063ed603368534bf988a32488
 size 990345312
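
The Git LFS pointer above pins the new weights to a sha256 digest. As an optional, hedged sketch (repo id assumed from the model name), a downloaded copy can be checked against that digest:

```python
# Verify a downloaded model.safetensors against the sha256 recorded in the LFS pointer.
import hashlib
from huggingface_hub import hf_hub_download

EXPECTED = "e0ff30f97d2a579460af9f5c9ff4438df9646c1063ed603368534bf988a32488"

path = hf_hub_download(
    repo_id="satyanshu404/long-t5-local-base-finetuned-justification-v06",  # assumption
    filename="model.safetensors",
)

digest = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
        digest.update(chunk)

print("match:", digest.hexdigest() == EXPECTED)
```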