End of training
README.md
CHANGED
@@ -9,12 +9,12 @@ model-index:
|
|
9 |
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
10 |
should probably proofread and complete it, then remove this comment. -->
|
11 |
|
12 |
-
[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/thalesian-university-of-new-mexico-press/huggingface/runs/
|
13 |
# train_2
|
14 |
|
15 |
This model was trained from scratch on the None dataset.
|
16 |
It achieves the following results on the evaluation set:
|
17 |
-
- Loss:
|
18 |
|
19 |
## Model description
|
20 |
|
@@ -33,47 +33,168 @@ More information needed
|
|
33 |
### Training hyperparameters
|
34 |
|
35 |
The following hyperparameters were used during training:
|
36 |
-
- learning_rate: 5e-
|
37 |
- train_batch_size: 32
|
38 |
- eval_batch_size: 32
|
39 |
- seed: 42
|
40 |
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
|
41 |
-
- lr_scheduler_type:
|
42 |
-
- lr_scheduler_warmup_steps:
|
43 |
-
- num_epochs:
|
44 |
|
45 |
### Training results
|
46 |
|
47 |
-
| Training Loss | Epoch | Step
|
48 |
-
|
49 |
-
|
|
50 |
-
|
|
51 |
-
|
|
52 |
-
|
|
53 |
-
|
|
54 |
-
|
|
55 |
-
|
|
56 |
-
|
|
57 |
-
|
|
58 |
-
| 0.
|
59 |
-
| 0.
|
60 |
-
| 0.
|
61 |
-
| 0.
|
62 |
-
| 0.
|
63 |
-
| 0.
|
64 |
-
| 0.
|
65 |
-
| 0.
|
66 |
-
| 0.
|
67 |
-
| 0.
|
68 |
-
| 0.
|
69 |
-
| 0.
|
70 |
-
| 0.
|
71 |
-
| 0.
|
72 |
-
| 0.
|
73 |
-
| 0.
|
74 |
-
| 0.
|
75 |
-
| 0.
|
76 |
-
| 0.
|
77 |
|
78 |
|
79 |
### Framework versions
|
|
|
9 |
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
10 |
should probably proofread and complete it, then remove this comment. -->
|
11 |
|
12 |
+
[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/thalesian-university-of-new-mexico-press/huggingface/runs/5dytrarp)
|
13 |
# train_2
|
14 |
|
15 |
This model was trained from scratch on the None dataset.
|
16 |
It achieves the following results on the evaluation set:
|
17 |
+
- Loss: 2.0585
|
18 |
|
19 |
## Model description
|
20 |
|
|
|
33 |
### Training hyperparameters
|
34 |
|
35 |
The following hyperparameters were used during training:
|
36 |
+
- learning_rate: 5e-08
|
37 |
- train_batch_size: 32
|
38 |
- eval_batch_size: 32
|
39 |
- seed: 42
|
40 |
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
|
41 |
+
- lr_scheduler_type: inverse_sqrt
|
42 |
+
- lr_scheduler_warmup_steps: 10000
|
43 |
+
- num_epochs: 1000
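The scheduler settings above (`inverse_sqrt`, 10000 warmup steps, peak learning rate 5e-08) can be illustrated with a small standalone sketch. It assumes linear warmup to the peak rate followed by decay proportional to 1/sqrt(step), with the decay timescale equal to the warmup length. The helper name `inverse_sqrt_lr` is hypothetical, and the scheduler the Trainer actually used may differ in detail.

```python
import math

PEAK_LR = 5e-08        # learning_rate from the hyperparameter list
WARMUP_STEPS = 10000   # lr_scheduler_warmup_steps from the hyperparameter list


def inverse_sqrt_lr(step: int,
                    peak_lr: float = PEAK_LR,
                    warmup_steps: int = WARMUP_STEPS) -> float:
    """Hypothetical helper: learning rate at a given optimizer step.

    Assumption: linear warmup, then inverse square root decay whose
    timescale equals the warmup length.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak rate.
        return peak_lr * step / max(1, warmup_steps)
    timescale = max(1, warmup_steps)
    # Decay factor is 1.0 at the end of warmup and shrinks as 1/sqrt(step).
    return peak_lr * math.sqrt(timescale / max(step, timescale))


if __name__ == "__main__":
    # 1988 steps is one epoch in the results table; 296212 is the last logged step.
    for step in (1988, 10000, 40000, 296212):
        print(f"step {step:>6}: lr = {inverse_sqrt_lr(step):.3e}")
```

Under these assumptions the rate peaks at step 10000, shortly after the end of epoch 5 (step 9940), and has fallen to half the peak by step 40000.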
|
44 |
|
45 |
### Training results
|
46 |
|
47 |
+
| Training Loss | Epoch | Step | Validation Loss |
|
48 |
+
|:-------------:|:-----:|:------:|:---------------:|
|
49 |
+
| 1.2945 | 1.0 | 1988 | 8.0900 |
|
50 |
+
| 1.2924 | 2.0 | 3976 | 8.0917 |
|
51 |
+
| 1.2384 | 3.0 | 5964 | 7.8635 |
|
52 |
+
| 1.1923 | 4.0 | 7952 | 7.5532 |
|
53 |
+
| 1.1749 | 5.0 | 9940 | 7.0878 |
|
54 |
+
| 1.1125 | 6.0 | 11928 | 6.7368 |
|
55 |
+
| 1.0792 | 7.0 | 13916 | 6.4160 |
|
56 |
+
| 1.0603 | 8.0 | 15904 | 6.1504 |
|
57 |
+
| 1.0207 | 9.0 | 17892 | 5.8524 |
|
58 |
+
| 0.9642 | 10.0 | 19880 | 5.5657 |
|
59 |
+
| 0.9807 | 11.0 | 21868 | 5.3785 |
|
60 |
+
| 0.9153 | 12.0 | 23856 | 5.4038 |
|
61 |
+
| 0.9148 | 13.0 | 25844 | 5.1502 |
|
62 |
+
| 0.9351 | 14.0 | 27832 | 5.0736 |
|
63 |
+
| 0.9018 | 15.0 | 29820 | 4.9524 |
|
64 |
+
| 0.8817 | 16.0 | 31808 | 4.8684 |
|
65 |
+
| 0.8734 | 17.0 | 33796 | 4.8746 |
|
66 |
+
| 0.8526 | 18.0 | 35784 | 4.7093 |
|
67 |
+
| 0.8478 | 19.0 | 37772 | 4.6303 |
|
68 |
+
| 0.8323 | 20.0 | 39760 | 4.4704 |
|
69 |
+
| 0.8272 | 21.0 | 41748 | 4.4963 |
|
70 |
+
| 0.8388 | 22.0 | 43736 | 4.2419 |
|
71 |
+
| 0.8082 | 23.0 | 45724 | 4.3082 |
|
72 |
+
| 0.7752 | 24.0 | 47712 | 4.2786 |
|
73 |
+
| 0.7942 | 25.0 | 49700 | 4.1537 |
|
74 |
+
| 0.774 | 26.0 | 51688 | 4.0926 |
|
75 |
+
| 0.7832 | 27.0 | 53676 | 4.0640 |
|
76 |
+
| 0.7452 | 28.0 | 55664 | 4.0172 |
|
77 |
+
| 0.7676 | 29.0 | 57652 | 3.9586 |
|
78 |
+
| 0.7311 | 30.0 | 59640 | 3.9717 |
|
79 |
+
| 0.7376 | 31.0 | 61628 | 3.8749 |
|
80 |
+
| 0.7316 | 32.0 | 63616 | 3.7216 |
|
81 |
+
| 0.7196 | 33.0 | 65604 | 3.7663 |
|
82 |
+
| 0.7285 | 34.0 | 67592 | 3.7143 |
|
83 |
+
| 0.7095 | 35.0 | 69580 | 3.6517 |
|
84 |
+
| 0.6921 | 36.0 | 71568 | 3.6316 |
|
85 |
+
| 0.6997 | 37.0 | 73556 | 3.6626 |
|
86 |
+
| 0.6903 | 38.0 | 75544 | 3.5780 |
|
87 |
+
| 0.6799 | 39.0 | 77532 | 3.5831 |
|
88 |
+
| 0.6941 | 40.0 | 79520 | 3.4966 |
|
89 |
+
| 0.6882 | 41.0 | 81508 | 3.3806 |
|
90 |
+
| 0.6874 | 42.0 | 83496 | 3.4282 |
|
91 |
+
| 0.6681 | 43.0 | 85484 | 3.3843 |
|
92 |
+
| 0.6654 | 44.0 | 87472 | 3.4136 |
|
93 |
+
| 0.6716 | 45.0 | 89460 | 3.4052 |
|
94 |
+
| 0.6688 | 46.0 | 91448 | 3.3275 |
|
95 |
+
| 0.6631 | 47.0 | 93436 | 3.3487 |
|
96 |
+
| 0.6419 | 48.0 | 95424 | 3.2867 |
|
97 |
+
| 0.6528 | 49.0 | 97412 | 3.2035 |
|
98 |
+
| 0.6427 | 50.0 | 99400 | 3.2311 |
|
99 |
+
| 0.6514 | 51.0 | 101388 | 3.1894 |
|
100 |
+
| 0.6406 | 52.0 | 103376 | 3.2018 |
|
101 |
+
| 0.6382 | 53.0 | 105364 | 3.1603 |
|
102 |
+
| 0.6282 | 54.0 | 107352 | 3.0533 |
|
103 |
+
| 0.6327 | 55.0 | 109340 | 3.0959 |
|
104 |
+
| 0.6284 | 56.0 | 111328 | 3.0642 |
|
105 |
+
| 0.6434 | 57.0 | 113316 | 3.0728 |
|
106 |
+
| 0.6105 | 58.0 | 115304 | 3.0209 |
|
107 |
+
| 0.6354 | 59.0 | 117292 | 2.9889 |
|
108 |
+
| 0.6145 | 60.0 | 119280 | 2.9779 |
|
109 |
+
| 0.6126 | 61.0 | 121268 | 2.9797 |
|
110 |
+
| 0.6126 | 62.0 | 123256 | 2.9048 |
|
111 |
+
| 0.6234 | 63.0 | 125244 | 2.9306 |
|
112 |
+
| 0.6086 | 64.0 | 127232 | 2.8959 |
|
113 |
+
| 0.6023 | 65.0 | 129220 | 2.9148 |
|
114 |
+
| 0.597 | 66.0 | 131208 | 2.8671 |
|
115 |
+
| 0.6064 | 67.0 | 133196 | 2.8509 |
|
116 |
+
| 0.598 | 68.0 | 135184 | 2.8174 |
|
117 |
+
| 0.6093 | 69.0 | 137172 | 2.7873 |
|
118 |
+
| 0.5935 | 70.0 | 139160 | 2.8055 |
|
119 |
+
| 0.5747 | 71.0 | 141148 | 2.7558 |
|
120 |
+
| 0.587 | 72.0 | 143136 | 2.7970 |
|
121 |
+
| 0.5748 | 73.0 | 145124 | 2.7178 |
|
122 |
+
| 0.5705 | 74.0 | 147112 | 2.7279 |
|
123 |
+
| 0.5729 | 75.0 | 149100 | 2.6740 |
|
124 |
+
| 0.5654 | 76.0 | 151088 | 2.6869 |
|
125 |
+
| 0.5677 | 77.0 | 153076 | 2.6599 |
|
126 |
+
| 0.5703 | 78.0 | 155064 | 2.6833 |
|
127 |
+
| 0.5553 | 79.0 | 157052 | 2.6079 |
|
128 |
+
| 0.5638 | 80.0 | 159040 | 2.5833 |
|
129 |
+
| 0.5604 | 81.0 | 161028 | 2.5944 |
|
130 |
+
| 0.5579 | 82.0 | 163016 | 2.5878 |
|
131 |
+
| 0.5486 | 83.0 | 165004 | 2.5828 |
|
132 |
+
| 0.561 | 84.0 | 166992 | 2.6022 |
|
133 |
+
| 0.5584 | 85.0 | 168980 | 2.5213 |
|
134 |
+
| 0.5582 | 86.0 | 170968 | 2.5448 |
|
135 |
+
| 0.5523 | 87.0 | 172956 | 2.5131 |
|
136 |
+
| 0.5589 | 88.0 | 174944 | 2.5138 |
|
137 |
+
| 0.548 | 89.0 | 176932 | 2.5272 |
|
138 |
+
| 0.5594 | 90.0 | 178920 | 2.5176 |
|
139 |
+
| 0.5366 | 91.0 | 180908 | 2.4901 |
|
140 |
+
| 0.5457 | 92.0 | 182896 | 2.4668 |
|
141 |
+
| 0.5457 | 93.0 | 184884 | 2.4619 |
|
142 |
+
| 0.5458 | 94.0 | 186872 | 2.4393 |
|
143 |
+
| 0.5431 | 95.0 | 188860 | 2.4425 |
|
144 |
+
| 0.5338 | 96.0 | 190848 | 2.4113 |
|
145 |
+
| 0.5432 | 97.0 | 192836 | 2.4143 |
|
146 |
+
| 0.5351 | 98.0 | 194824 | 2.3863 |
|
147 |
+
| 0.5255 | 99.0 | 196812 | 2.3990 |
|
148 |
+
| 0.5345 | 100.0 | 198800 | 2.3748 |
|
149 |
+
| 0.5305 | 101.0 | 200788 | 2.3814 |
|
150 |
+
| 0.5405 | 102.0 | 202776 | 2.3421 |
|
151 |
+
| 0.5198 | 103.0 | 204764 | 2.3486 |
|
152 |
+
| 0.5197 | 104.0 | 206752 | 2.3400 |
|
153 |
+
| 0.53 | 105.0 | 208740 | 2.3484 |
|
154 |
+
| 0.5304 | 106.0 | 210728 | 2.3230 |
|
155 |
+
| 0.5164 | 107.0 | 212716 | 2.3068 |
|
156 |
+
| 0.5227 | 108.0 | 214704 | 2.2744 |
|
157 |
+
| 0.5141 | 109.0 | 216692 | 2.3020 |
|
158 |
+
| 0.5189 | 110.0 | 218680 | 2.2691 |
|
159 |
+
| 0.5132 | 111.0 | 220668 | 2.2616 |
|
160 |
+
| 0.5299 | 112.0 | 222656 | 2.2685 |
|
161 |
+
| 0.5136 | 113.0 | 224644 | 2.2475 |
|
162 |
+
| 0.5289 | 114.0 | 226632 | 2.2411 |
|
163 |
+
| 0.5141 | 115.0 | 228620 | 2.2608 |
|
164 |
+
| 0.5169 | 116.0 | 230608 | 2.2724 |
|
165 |
+
| 0.509 | 117.0 | 232596 | 2.2502 |
|
166 |
+
| 0.5057 | 118.0 | 234584 | 2.1996 |
|
167 |
+
| 0.5131 | 119.0 | 236572 | 2.2038 |
|
168 |
+
| 0.5034 | 120.0 | 238560 | 2.1845 |
|
169 |
+
| 0.5081 | 121.0 | 240548 | 2.2009 |
|
170 |
+
| 0.5029 | 122.0 | 242536 | 2.2091 |
|
171 |
+
| 0.4968 | 123.0 | 244524 | 2.1905 |
|
172 |
+
| 0.508 | 124.0 | 246512 | 2.1822 |
|
173 |
+
| 0.5001 | 125.0 | 248500 | 2.1595 |
|
174 |
+
| 0.4891 | 126.0 | 250488 | 2.1897 |
|
175 |
+
| 0.4946 | 127.0 | 252476 | 2.1402 |
|
176 |
+
| 0.4973 | 128.0 | 254464 | 2.1588 |
|
177 |
+
| 0.509 | 129.0 | 256452 | 2.1037 |
|
178 |
+
| 0.4889 | 130.0 | 258440 | 2.1252 |
|
179 |
+
| 0.4905 | 131.0 | 260428 | 2.1261 |
|
180 |
+
| 0.5083 | 132.0 | 262416 | 2.1490 |
|
181 |
+
| 0.495 | 133.0 | 264404 | 2.0943 |
|
182 |
+
| 0.5042 | 134.0 | 266392 | 2.1391 |
|
183 |
+
| 0.4951 | 135.0 | 268380 | 2.1217 |
|
184 |
+
| 0.4982 | 136.0 | 270368 | 2.1177 |
|
185 |
+
| 0.4974 | 137.0 | 272356 | 2.0817 |
|
186 |
+
| 0.4858 | 138.0 | 274344 | 2.0655 |
|
187 |
+
| 0.4841 | 139.0 | 276332 | 2.0900 |
|
188 |
+
| 0.487 | 140.0 | 278320 | 2.0710 |
|
189 |
+
| 0.4959 | 141.0 | 280308 | 2.0408 |
|
190 |
+
| 0.4861 | 142.0 | 282296 | 2.0847 |
|
191 |
+
| 0.4842 | 143.0 | 284284 | 2.0618 |
|
192 |
+
| 0.4858 | 144.0 | 286272 | 2.0394 |
|
193 |
+
| 0.4906 | 145.0 | 288260 | 2.0502 |
|
194 |
+
| 0.4959 | 146.0 | 290248 | 2.0793 |
|
195 |
+
| 0.4864 | 147.0 | 292236 | 2.0420 |
|
196 |
+
| 0.4848 | 148.0 | 294224 | 2.0643 |
|
197 |
+
| 0.4965 | 149.0 | 296212 | 2.0585 |
|
198 |
|
199 |
|
200 |
### Framework versions
|
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 246418472
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:85236b66b083fc139a05461b518c00ace87d60c87e29a39190856d2bd393399e
|
3 |
size 246418472
|
runs/Jan06_21-33-12_Lees-MacBook-Pro.local/events.out.tfevents.1736224392.Lees-MacBook-Pro.local
ADDED
@@ -0,0 +1,3 @@
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:8073dc4f3ceb898ffcce09c2684ed8937537b665444205d06bb32a832641791f
|
3 |
+
size 142129
|
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 5432
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:131e46257eaf32050b2c489aea295bf384ffae54f5c964c4998a421e2df70357
|
3 |
size 5432
|
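The `model.safetensors`, `events.out.tfevents...`, and `training_args.bin` entries above are Git LFS pointer files: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. A minimal sketch of reading one, using the new `model.safetensors` pointer from this commit, is shown below. The function name `parse_lfs_pointer` is hypothetical and it handles only the three keys seen here.

```python
# New-side pointer for model.safetensors, copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:85236b66b083fc139a05461b518c00ace87d60c87e29a39190856d2bd393399e
size 246418472
"""


def parse_lfs_pointer(text: str) -> dict:
    """Hypothetical helper: parse the key/value lines of a Git LFS pointer."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
print(f"{info['algo']} digest {info['digest'][:12]}..., {info['size'] / 2**20:.1f} MiB")
```

The size is unchanged between the old and new pointers for `model.safetensors` (246418472 bytes) and `training_args.bin` (5432 bytes), so only the object IDs differ, which is consistent with the same files being retrained or re-saved with new contents.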