Thalesian committed on
Commit aae9b2c · verified · 1 Parent(s): d69c48e

End of training

README.md CHANGED
@@ -9,12 +9,12 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/thalesian-university-of-new-mexico-press/huggingface/runs/lzwqc83m)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/thalesian-university-of-new-mexico-press/huggingface/runs/5dytrarp)
 # train_2
 
 This model was trained from scratch on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3439
+- Loss: 2.0585
 
 ## Model description
 
@@ -33,47 +33,168 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 5e-05
+- learning_rate: 5e-08
 - train_batch_size: 32
 - eval_batch_size: 32
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 5000
-- num_epochs: 200
+- lr_scheduler_type: inverse_sqrt
+- lr_scheduler_warmup_steps: 10000
+- num_epochs: 1000
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:-----:|:---------------:|
-| 0.4089        | 1.0   | 1988  | 1.0441          |
-| 0.2536        | 2.0   | 3976  | 0.5032          |
-| 0.2009        | 3.0   | 5964  | 0.4043          |
-| 0.1859        | 4.0   | 7952  | 0.3853          |
-| 0.1824        | 5.0   | 9940  | 0.3758          |
-| 0.1818        | 6.0   | 11928 | 0.3654          |
-| 0.1788        | 7.0   | 13916 | 0.3665          |
-| 0.1753        | 8.0   | 15904 | 0.3614          |
-| 0.1741        | 9.0   | 17892 | 0.3569          |
-| 0.1676        | 10.0  | 19880 | 0.3625          |
-| 0.1737        | 11.0  | 21868 | 0.3558          |
-| 0.166         | 12.0  | 23856 | 0.3531          |
-| 0.1664        | 13.0  | 25844 | 0.3662          |
-| 0.1656        | 14.0  | 27832 | 0.3522          |
-| 0.1655        | 15.0  | 29820 | 0.3578          |
-| 0.1635        | 16.0  | 31808 | 0.3517          |
-| 0.1607        | 17.0  | 33796 | 0.3611          |
-| 0.1617        | 18.0  | 35784 | 0.3497          |
-| 0.1606        | 19.0  | 37772 | 0.3429          |
-| 0.1548        | 20.0  | 39760 | 0.3541          |
-| 0.1563        | 21.0  | 41748 | 0.3443          |
-| 0.1592        | 22.0  | 43736 | 0.3500          |
-| 0.1554        | 23.0  | 45724 | 0.3371          |
-| 0.1557        | 24.0  | 47712 | 0.3476          |
-| 0.1548        | 25.0  | 49700 | 0.3471          |
-| 0.1524        | 26.0  | 51688 | 0.3392          |
-| 0.1539        | 27.0  | 53676 | 0.3488          |
-| 0.1483        | 28.0  | 55664 | 0.3439          |
+| Training Loss | Epoch | Step   | Validation Loss |
+|:-------------:|:-----:|:------:|:---------------:|
+| 1.2945        | 1.0   | 1988   | 8.0900          |
+| 1.2924        | 2.0   | 3976   | 8.0917          |
+| 1.2384        | 3.0   | 5964   | 7.8635          |
+| 1.1923        | 4.0   | 7952   | 7.5532          |
+| 1.1749        | 5.0   | 9940   | 7.0878          |
+| 1.1125        | 6.0   | 11928  | 6.7368          |
+| 1.0792        | 7.0   | 13916  | 6.4160          |
+| 1.0603        | 8.0   | 15904  | 6.1504          |
+| 1.0207        | 9.0   | 17892  | 5.8524          |
+| 0.9642        | 10.0  | 19880  | 5.5657          |
+| 0.9807        | 11.0  | 21868  | 5.3785          |
+| 0.9153        | 12.0  | 23856  | 5.4038          |
+| 0.9148        | 13.0  | 25844  | 5.1502          |
+| 0.9351        | 14.0  | 27832  | 5.0736          |
+| 0.9018        | 15.0  | 29820  | 4.9524          |
+| 0.8817        | 16.0  | 31808  | 4.8684          |
+| 0.8734        | 17.0  | 33796  | 4.8746          |
+| 0.8526        | 18.0  | 35784  | 4.7093          |
+| 0.8478        | 19.0  | 37772  | 4.6303          |
+| 0.8323        | 20.0  | 39760  | 4.4704          |
+| 0.8272        | 21.0  | 41748  | 4.4963          |
+| 0.8388        | 22.0  | 43736  | 4.2419          |
+| 0.8082        | 23.0  | 45724  | 4.3082          |
+| 0.7752        | 24.0  | 47712  | 4.2786          |
+| 0.7942        | 25.0  | 49700  | 4.1537          |
+| 0.774         | 26.0  | 51688  | 4.0926          |
+| 0.7832        | 27.0  | 53676  | 4.0640          |
+| 0.7452        | 28.0  | 55664  | 4.0172          |
+| 0.7676        | 29.0  | 57652  | 3.9586          |
+| 0.7311        | 30.0  | 59640  | 3.9717          |
+| 0.7376        | 31.0  | 61628  | 3.8749          |
+| 0.7316        | 32.0  | 63616  | 3.7216          |
+| 0.7196        | 33.0  | 65604  | 3.7663          |
+| 0.7285        | 34.0  | 67592  | 3.7143          |
+| 0.7095        | 35.0  | 69580  | 3.6517          |
+| 0.6921        | 36.0  | 71568  | 3.6316          |
+| 0.6997        | 37.0  | 73556  | 3.6626          |
+| 0.6903        | 38.0  | 75544  | 3.5780          |
+| 0.6799        | 39.0  | 77532  | 3.5831          |
+| 0.6941        | 40.0  | 79520  | 3.4966          |
+| 0.6882        | 41.0  | 81508  | 3.3806          |
+| 0.6874        | 42.0  | 83496  | 3.4282          |
+| 0.6681        | 43.0  | 85484  | 3.3843          |
+| 0.6654        | 44.0  | 87472  | 3.4136          |
+| 0.6716        | 45.0  | 89460  | 3.4052          |
+| 0.6688        | 46.0  | 91448  | 3.3275          |
+| 0.6631        | 47.0  | 93436  | 3.3487          |
+| 0.6419        | 48.0  | 95424  | 3.2867          |
+| 0.6528        | 49.0  | 97412  | 3.2035          |
+| 0.6427        | 50.0  | 99400  | 3.2311          |
+| 0.6514        | 51.0  | 101388 | 3.1894          |
+| 0.6406        | 52.0  | 103376 | 3.2018          |
+| 0.6382        | 53.0  | 105364 | 3.1603          |
+| 0.6282        | 54.0  | 107352 | 3.0533          |
+| 0.6327        | 55.0  | 109340 | 3.0959          |
+| 0.6284        | 56.0  | 111328 | 3.0642          |
+| 0.6434        | 57.0  | 113316 | 3.0728          |
+| 0.6105        | 58.0  | 115304 | 3.0209          |
+| 0.6354        | 59.0  | 117292 | 2.9889          |
+| 0.6145        | 60.0  | 119280 | 2.9779          |
+| 0.6126        | 61.0  | 121268 | 2.9797          |
+| 0.6126        | 62.0  | 123256 | 2.9048          |
+| 0.6234        | 63.0  | 125244 | 2.9306          |
+| 0.6086        | 64.0  | 127232 | 2.8959          |
+| 0.6023        | 65.0  | 129220 | 2.9148          |
+| 0.597         | 66.0  | 131208 | 2.8671          |
+| 0.6064        | 67.0  | 133196 | 2.8509          |
+| 0.598         | 68.0  | 135184 | 2.8174          |
+| 0.6093        | 69.0  | 137172 | 2.7873          |
+| 0.5935        | 70.0  | 139160 | 2.8055          |
+| 0.5747        | 71.0  | 141148 | 2.7558          |
+| 0.587         | 72.0  | 143136 | 2.7970          |
+| 0.5748        | 73.0  | 145124 | 2.7178          |
+| 0.5705        | 74.0  | 147112 | 2.7279          |
+| 0.5729        | 75.0  | 149100 | 2.6740          |
+| 0.5654        | 76.0  | 151088 | 2.6869          |
+| 0.5677        | 77.0  | 153076 | 2.6599          |
+| 0.5703        | 78.0  | 155064 | 2.6833          |
+| 0.5553        | 79.0  | 157052 | 2.6079          |
+| 0.5638        | 80.0  | 159040 | 2.5833          |
+| 0.5604        | 81.0  | 161028 | 2.5944          |
+| 0.5579        | 82.0  | 163016 | 2.5878          |
+| 0.5486        | 83.0  | 165004 | 2.5828          |
+| 0.561         | 84.0  | 166992 | 2.6022          |
+| 0.5584        | 85.0  | 168980 | 2.5213          |
+| 0.5582        | 86.0  | 170968 | 2.5448          |
+| 0.5523        | 87.0  | 172956 | 2.5131          |
+| 0.5589        | 88.0  | 174944 | 2.5138          |
+| 0.548         | 89.0  | 176932 | 2.5272          |
+| 0.5594        | 90.0  | 178920 | 2.5176          |
+| 0.5366        | 91.0  | 180908 | 2.4901          |
+| 0.5457        | 92.0  | 182896 | 2.4668          |
+| 0.5457        | 93.0  | 184884 | 2.4619          |
+| 0.5458        | 94.0  | 186872 | 2.4393          |
+| 0.5431        | 95.0  | 188860 | 2.4425          |
+| 0.5338        | 96.0  | 190848 | 2.4113          |
+| 0.5432        | 97.0  | 192836 | 2.4143          |
+| 0.5351        | 98.0  | 194824 | 2.3863          |
+| 0.5255        | 99.0  | 196812 | 2.3990          |
+| 0.5345        | 100.0 | 198800 | 2.3748          |
+| 0.5305        | 101.0 | 200788 | 2.3814          |
+| 0.5405        | 102.0 | 202776 | 2.3421          |
+| 0.5198        | 103.0 | 204764 | 2.3486          |
+| 0.5197        | 104.0 | 206752 | 2.3400          |
+| 0.53          | 105.0 | 208740 | 2.3484          |
+| 0.5304        | 106.0 | 210728 | 2.3230          |
+| 0.5164        | 107.0 | 212716 | 2.3068          |
+| 0.5227        | 108.0 | 214704 | 2.2744          |
+| 0.5141        | 109.0 | 216692 | 2.3020          |
+| 0.5189        | 110.0 | 218680 | 2.2691          |
+| 0.5132        | 111.0 | 220668 | 2.2616          |
+| 0.5299        | 112.0 | 222656 | 2.2685          |
+| 0.5136        | 113.0 | 224644 | 2.2475          |
+| 0.5289        | 114.0 | 226632 | 2.2411          |
+| 0.5141        | 115.0 | 228620 | 2.2608          |
+| 0.5169        | 116.0 | 230608 | 2.2724          |
+| 0.509         | 117.0 | 232596 | 2.2502          |
+| 0.5057        | 118.0 | 234584 | 2.1996          |
+| 0.5131        | 119.0 | 236572 | 2.2038          |
+| 0.5034        | 120.0 | 238560 | 2.1845          |
+| 0.5081        | 121.0 | 240548 | 2.2009          |
+| 0.5029        | 122.0 | 242536 | 2.2091          |
+| 0.4968        | 123.0 | 244524 | 2.1905          |
+| 0.508         | 124.0 | 246512 | 2.1822          |
+| 0.5001        | 125.0 | 248500 | 2.1595          |
+| 0.4891        | 126.0 | 250488 | 2.1897          |
+| 0.4946        | 127.0 | 252476 | 2.1402          |
+| 0.4973        | 128.0 | 254464 | 2.1588          |
+| 0.509         | 129.0 | 256452 | 2.1037          |
+| 0.4889        | 130.0 | 258440 | 2.1252          |
+| 0.4905        | 131.0 | 260428 | 2.1261          |
+| 0.5083        | 132.0 | 262416 | 2.1490          |
+| 0.495         | 133.0 | 264404 | 2.0943          |
+| 0.5042        | 134.0 | 266392 | 2.1391          |
+| 0.4951        | 135.0 | 268380 | 2.1217          |
+| 0.4982        | 136.0 | 270368 | 2.1177          |
+| 0.4974        | 137.0 | 272356 | 2.0817          |
+| 0.4858        | 138.0 | 274344 | 2.0655          |
+| 0.4841        | 139.0 | 276332 | 2.0900          |
+| 0.487         | 140.0 | 278320 | 2.0710          |
+| 0.4959        | 141.0 | 280308 | 2.0408          |
+| 0.4861        | 142.0 | 282296 | 2.0847          |
+| 0.4842        | 143.0 | 284284 | 2.0618          |
+| 0.4858        | 144.0 | 286272 | 2.0394          |
+| 0.4906        | 145.0 | 288260 | 2.0502          |
+| 0.4959        | 146.0 | 290248 | 2.0793          |
+| 0.4864        | 147.0 | 292236 | 2.0420          |
+| 0.4848        | 148.0 | 294224 | 2.0643          |
+| 0.4965        | 149.0 | 296212 | 2.0585          |
 
 
 ### Framework versions
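The new hyperparameters switch from a `linear` to an `inverse_sqrt` learning-rate schedule with 10000 warmup steps. A minimal sketch of that schedule shape is below; the function name `inverse_sqrt_lr` is hypothetical, and the Hugging Face Trainer's actual implementation may differ in constants and edge cases:

```python
import math

def inverse_sqrt_lr(step: int, base_lr: float = 5e-08, warmup_steps: int = 10000) -> float:
    """Illustrative inverse-sqrt schedule: linear warmup to base_lr,
    then decay proportional to 1/sqrt(step)."""
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr over the first warmup_steps steps.
        return base_lr * step / warmup_steps
    # After warmup the rate falls off as 1/sqrt(step), scaled so that
    # lr(warmup_steps) == base_lr exactly.
    return base_lr * math.sqrt(warmup_steps / step)
```

With `base_lr=5e-08` and `warmup_steps=10000` (the values in this commit), the rate peaks at 5e-08 at step 10000 and halves by step 40000, which is consistent with the very slow loss decline seen in the new training-results table.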
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3e99018c8fff56732c81d699ad8cd8eefaa60709de16a15081fdc8760235c2d0
+oid sha256:85236b66b083fc139a05461b518c00ace87d60c87e29a39190856d2bd393399e
 size 246418472
runs/Jan06_21-33-12_Lees-MacBook-Pro.local/events.out.tfevents.1736224392.Lees-MacBook-Pro.local ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8073dc4f3ceb898ffcce09c2684ed8937537b665444205d06bb32a832641791f
+size 142129
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:04ad3d774f786b4d77a5dacc1898f6611a58b91d370e43b98f9d54734d371079
+oid sha256:131e46257eaf32050b2c489aea295bf384ffae54f5c964c4998a421e2df70357
 size 5432