Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697556015.bce904bcef33.2251.0 +3 -0
- test.tsv +0 -0
- training.log +236 -0
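
For reference, a commit like this is typically produced with the huggingface_hub Python client. A minimal sketch (the repo id and local folder path below are placeholders, not values taken from this commit):

    from huggingface_hub import HfApi

    api = HfApi()
    # Upload a local training-output folder to a model repo on the Hub.
    # "your-username/your-model-repo" and "./training-output" are placeholders.
    api.upload_folder(
        repo_id="your-username/your-model-repo",
        repo_type="model",
        folder_path="./training-output",
        commit_message="Upload folder using huggingface_hub",
    )
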
best-model.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3f7fbec72c4c30d8dca6203ce637d99a47de7ffc2da743c4ae0335b7dea321e7
+size 440941957
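
The three lines above are a Git LFS pointer; the actual ~441 MB checkpoint is stored via LFS. Once the file has been pulled locally, it can be loaded with Flair's SequenceTagger. A short sketch (the example sentence is made up for illustration):

    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Load the checkpoint added in this commit (assumes a local clone with LFS files pulled).
    tagger = SequenceTagger.load("best-model.pt")

    # Tag a sample Dutch sentence and print the predicted entity spans.
    sentence = Sentence("Rembrandt van Rijn werd geboren in Leiden .")
    tagger.predict(sentence)
    print(sentence.get_spans("ner"))
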
dev.tsv ADDED
The diff for this file is too large to render. See raw diff.
loss.tsv ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      15:21:29   0.0000         0.3901      0.1134    0.8550         0.5909      0.6988  0.5406
+2      15:22:43   0.0000         0.0928      0.0980    0.8694         0.6808      0.7636  0.6229
+3      15:23:58   0.0000         0.0679      0.0724    0.8808         0.8399      0.8599  0.7612
+4      15:25:12   0.0000         0.0513      0.0875    0.8631         0.8399      0.8513  0.7521
+5      15:26:27   0.0000         0.0423      0.1164    0.8706         0.7087      0.7813  0.6484
+6      15:27:41   0.0000         0.0422      0.1275    0.8547         0.7841      0.8179  0.6995
+7      15:28:54   0.0000         0.0241      0.1313    0.8538         0.7965      0.8242  0.7093
+8      15:30:08   0.0000         0.0360      0.1406    0.8753         0.8048      0.8385  0.7321
+9      15:31:22   0.0000         0.0215      0.1587    0.7983         0.7934      0.7959  0.6767
+10     15:32:35   0.0000         0.0191      0.1475    0.8425         0.8068      0.8243  0.7198
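
loss.tsv is a small tab-separated table with one row per epoch (the same numbers as the DEV lines in training.log below). It can be inspected with pandas; a minimal sketch:

    import pandas as pd

    # Read the per-epoch metrics written by the Flair trainer.
    df = pd.read_csv("loss.tsv", sep="\t")

    # Report the epoch with the highest dev F1 (epoch 3 in this run).
    best = df.loc[df["DEV_F1"].idxmax()]
    print(f"best DEV_F1 = {best['DEV_F1']:.4f} at epoch {int(best['EPOCH'])}")
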
runs/events.out.tfevents.1697556015.bce904bcef33.2251.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cd57e1a2f8c4129d7c57d374753cac94583b392ccca33ce5a1ac7c98cd71c68e
+size 808480
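
The runs/ file is a TensorBoard event log written by the TensorboardLogger plugin noted in training.log below, again stored as an LFS pointer. It can be opened with "tensorboard --logdir runs" or read programmatically; a sketch using TensorBoard's EventAccumulator (the exact scalar tag names depend on the logger configuration and are not shown here):

    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

    # Load scalar events from the runs/ directory (assumes LFS files are pulled).
    acc = EventAccumulator("runs")
    acc.Reload()

    # List the available scalar tags, then dump the first one as (step, value) pairs.
    tags = acc.Tags()["scalars"]
    print(tags)
    for event in acc.Scalars(tags[0]):
        print(event.step, event.value)
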
test.tsv ADDED
The diff for this file is too large to render. See raw diff.
training.log ADDED
@@ -0,0 +1,236 @@
+2023-10-17 15:20:15,009 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:20:15,010 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): ElectraModel(
+      (embeddings): ElectraEmbeddings(
+        (word_embeddings): Embedding(32001, 768)
+        (position_embeddings): Embedding(512, 768)
+        (token_type_embeddings): Embedding(2, 768)
+        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): ElectraEncoder(
+        (layer): ModuleList(
+          (0-11): 12 x ElectraLayer(
+            (attention): ElectraAttention(
+              (self): ElectraSelfAttention(
+                (query): Linear(in_features=768, out_features=768, bias=True)
+                (key): Linear(in_features=768, out_features=768, bias=True)
+                (value): Linear(in_features=768, out_features=768, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): ElectraSelfOutput(
+                (dense): Linear(in_features=768, out_features=768, bias=True)
+                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): ElectraIntermediate(
+              (dense): Linear(in_features=768, out_features=3072, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): ElectraOutput(
+              (dense): Linear(in_features=3072, out_features=768, bias=True)
+              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=768, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-17 15:20:15,010 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:20:15,010 MultiCorpus: 5777 train + 722 dev + 723 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
+2023-10-17 15:20:15,010 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:20:15,010 Train: 5777 sentences
+2023-10-17 15:20:15,011 (train_with_dev=False, train_with_test=False)
+2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:20:15,011 Training Params:
+2023-10-17 15:20:15,011 - learning_rate: "3e-05"
+2023-10-17 15:20:15,011 - mini_batch_size: "4"
+2023-10-17 15:20:15,011 - max_epochs: "10"
+2023-10-17 15:20:15,011 - shuffle: "True"
+2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:20:15,011 Plugins:
+2023-10-17 15:20:15,011 - TensorboardLogger
+2023-10-17 15:20:15,011 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:20:15,011 Final evaluation on model from best epoch (best-model.pt)
+2023-10-17 15:20:15,011 - metric: "('micro avg', 'f1-score')"
+2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:20:15,011 Computation:
+2023-10-17 15:20:15,011 - compute on device: cuda:0
+2023-10-17 15:20:15,011 - embedding storage: none
+2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:20:15,011 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
+2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:20:15,011 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-17 15:20:23,073 epoch 1 - iter 144/1445 - loss 2.33324949 - time (sec): 8.06 - samples/sec: 2302.06 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 15:20:30,031 epoch 1 - iter 288/1445 - loss 1.40669834 - time (sec): 15.02 - samples/sec: 2293.24 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 15:20:37,277 epoch 1 - iter 432/1445 - loss 0.99311609 - time (sec): 22.26 - samples/sec: 2340.14 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 15:20:44,228 epoch 1 - iter 576/1445 - loss 0.79532458 - time (sec): 29.22 - samples/sec: 2337.21 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 15:20:51,036 epoch 1 - iter 720/1445 - loss 0.66249281 - time (sec): 36.02 - samples/sec: 2401.24 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 15:20:58,193 epoch 1 - iter 864/1445 - loss 0.56791320 - time (sec): 43.18 - samples/sec: 2443.46 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 15:21:05,065 epoch 1 - iter 1008/1445 - loss 0.50430390 - time (sec): 50.05 - samples/sec: 2460.17 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 15:21:12,157 epoch 1 - iter 1152/1445 - loss 0.45583800 - time (sec): 57.14 - samples/sec: 2469.61 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 15:21:19,146 epoch 1 - iter 1296/1445 - loss 0.42025604 - time (sec): 64.13 - samples/sec: 2471.73 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 15:21:26,123 epoch 1 - iter 1440/1445 - loss 0.39100512 - time (sec): 71.11 - samples/sec: 2470.36 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 15:21:26,351 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:21:26,351 EPOCH 1 done: loss 0.3901 - lr: 0.000030
+2023-10-17 15:21:29,006 DEV : loss 0.1134478896856308 - f1-score (micro avg) 0.6988
+2023-10-17 15:21:29,021 saving best model
+2023-10-17 15:21:29,419 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:21:36,099 epoch 2 - iter 144/1445 - loss 0.11423400 - time (sec): 6.68 - samples/sec: 2491.07 - lr: 0.000030 - momentum: 0.000000
+2023-10-17 15:21:42,808 epoch 2 - iter 288/1445 - loss 0.11006024 - time (sec): 13.39 - samples/sec: 2534.73 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 15:21:49,527 epoch 2 - iter 432/1445 - loss 0.10104983 - time (sec): 20.11 - samples/sec: 2552.67 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 15:21:56,423 epoch 2 - iter 576/1445 - loss 0.09617376 - time (sec): 27.00 - samples/sec: 2535.33 - lr: 0.000029 - momentum: 0.000000
+2023-10-17 15:22:03,695 epoch 2 - iter 720/1445 - loss 0.09369082 - time (sec): 34.27 - samples/sec: 2531.86 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 15:22:11,248 epoch 2 - iter 864/1445 - loss 0.09042169 - time (sec): 41.83 - samples/sec: 2531.47 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 15:22:18,355 epoch 2 - iter 1008/1445 - loss 0.09025166 - time (sec): 48.93 - samples/sec: 2507.99 - lr: 0.000028 - momentum: 0.000000
+2023-10-17 15:22:25,543 epoch 2 - iter 1152/1445 - loss 0.09048999 - time (sec): 56.12 - samples/sec: 2502.98 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 15:22:32,470 epoch 2 - iter 1296/1445 - loss 0.09136875 - time (sec): 63.05 - samples/sec: 2497.21 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 15:22:39,747 epoch 2 - iter 1440/1445 - loss 0.09254084 - time (sec): 70.33 - samples/sec: 2499.02 - lr: 0.000027 - momentum: 0.000000
+2023-10-17 15:22:39,976 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:22:39,977 EPOCH 2 done: loss 0.0928 - lr: 0.000027
+2023-10-17 15:22:43,507 DEV : loss 0.09797008335590363 - f1-score (micro avg) 0.7636
+2023-10-17 15:22:43,523 saving best model
+2023-10-17 15:22:44,069 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:22:51,178 epoch 3 - iter 144/1445 - loss 0.07299256 - time (sec): 7.11 - samples/sec: 2444.22 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 15:22:57,977 epoch 3 - iter 288/1445 - loss 0.06634279 - time (sec): 13.91 - samples/sec: 2490.30 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 15:23:05,074 epoch 3 - iter 432/1445 - loss 0.06581683 - time (sec): 21.00 - samples/sec: 2551.84 - lr: 0.000026 - momentum: 0.000000
+2023-10-17 15:23:11,936 epoch 3 - iter 576/1445 - loss 0.07201697 - time (sec): 27.87 - samples/sec: 2538.19 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 15:23:18,903 epoch 3 - iter 720/1445 - loss 0.07104181 - time (sec): 34.83 - samples/sec: 2511.65 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 15:23:25,927 epoch 3 - iter 864/1445 - loss 0.07023237 - time (sec): 41.86 - samples/sec: 2516.01 - lr: 0.000025 - momentum: 0.000000
+2023-10-17 15:23:33,061 epoch 3 - iter 1008/1445 - loss 0.06913832 - time (sec): 48.99 - samples/sec: 2490.96 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 15:23:40,153 epoch 3 - iter 1152/1445 - loss 0.06852697 - time (sec): 56.08 - samples/sec: 2486.94 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 15:23:47,600 epoch 3 - iter 1296/1445 - loss 0.06920774 - time (sec): 63.53 - samples/sec: 2481.25 - lr: 0.000024 - momentum: 0.000000
+2023-10-17 15:23:54,842 epoch 3 - iter 1440/1445 - loss 0.06776713 - time (sec): 70.77 - samples/sec: 2483.99 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 15:23:55,074 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:23:55,074 EPOCH 3 done: loss 0.0679 - lr: 0.000023
+2023-10-17 15:23:58,305 DEV : loss 0.07238871604204178 - f1-score (micro avg) 0.8599
+2023-10-17 15:23:58,320 saving best model
+2023-10-17 15:23:58,869 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:24:05,938 epoch 4 - iter 144/1445 - loss 0.03825652 - time (sec): 7.07 - samples/sec: 2582.66 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 15:24:13,011 epoch 4 - iter 288/1445 - loss 0.05121297 - time (sec): 14.14 - samples/sec: 2515.02 - lr: 0.000023 - momentum: 0.000000
+2023-10-17 15:24:20,248 epoch 4 - iter 432/1445 - loss 0.04707007 - time (sec): 21.38 - samples/sec: 2475.52 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 15:24:27,099 epoch 4 - iter 576/1445 - loss 0.04864143 - time (sec): 28.23 - samples/sec: 2490.67 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 15:24:34,071 epoch 4 - iter 720/1445 - loss 0.05041580 - time (sec): 35.20 - samples/sec: 2469.04 - lr: 0.000022 - momentum: 0.000000
+2023-10-17 15:24:41,242 epoch 4 - iter 864/1445 - loss 0.05057361 - time (sec): 42.37 - samples/sec: 2466.87 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 15:24:48,213 epoch 4 - iter 1008/1445 - loss 0.04981836 - time (sec): 49.34 - samples/sec: 2478.95 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 15:24:54,867 epoch 4 - iter 1152/1445 - loss 0.05022101 - time (sec): 56.00 - samples/sec: 2498.44 - lr: 0.000021 - momentum: 0.000000
+2023-10-17 15:25:01,564 epoch 4 - iter 1296/1445 - loss 0.05005487 - time (sec): 62.69 - samples/sec: 2514.15 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 15:25:08,655 epoch 4 - iter 1440/1445 - loss 0.05140703 - time (sec): 69.78 - samples/sec: 2519.10 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 15:25:08,886 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:25:08,886 EPOCH 4 done: loss 0.0513 - lr: 0.000020
+2023-10-17 15:25:12,521 DEV : loss 0.08748035877943039 - f1-score (micro avg) 0.8513
+2023-10-17 15:25:12,536 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:25:20,019 epoch 5 - iter 144/1445 - loss 0.02484292 - time (sec): 7.48 - samples/sec: 2365.34 - lr: 0.000020 - momentum: 0.000000
+2023-10-17 15:25:27,347 epoch 5 - iter 288/1445 - loss 0.02639251 - time (sec): 14.81 - samples/sec: 2426.84 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 15:25:34,576 epoch 5 - iter 432/1445 - loss 0.03062150 - time (sec): 22.04 - samples/sec: 2438.65 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 15:25:41,500 epoch 5 - iter 576/1445 - loss 0.03478141 - time (sec): 28.96 - samples/sec: 2433.14 - lr: 0.000019 - momentum: 0.000000
+2023-10-17 15:25:48,647 epoch 5 - iter 720/1445 - loss 0.03435012 - time (sec): 36.11 - samples/sec: 2431.81 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 15:25:55,712 epoch 5 - iter 864/1445 - loss 0.03915073 - time (sec): 43.17 - samples/sec: 2429.56 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 15:26:02,874 epoch 5 - iter 1008/1445 - loss 0.04019423 - time (sec): 50.34 - samples/sec: 2424.89 - lr: 0.000018 - momentum: 0.000000
+2023-10-17 15:26:09,941 epoch 5 - iter 1152/1445 - loss 0.04137874 - time (sec): 57.40 - samples/sec: 2449.49 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 15:26:16,823 epoch 5 - iter 1296/1445 - loss 0.04231658 - time (sec): 64.29 - samples/sec: 2455.18 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 15:26:24,165 epoch 5 - iter 1440/1445 - loss 0.04229848 - time (sec): 71.63 - samples/sec: 2452.34 - lr: 0.000017 - momentum: 0.000000
+2023-10-17 15:26:24,411 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:26:24,411 EPOCH 5 done: loss 0.0423 - lr: 0.000017
+2023-10-17 15:26:27,792 DEV : loss 0.11643949151039124 - f1-score (micro avg) 0.7813
+2023-10-17 15:26:27,810 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:26:35,160 epoch 6 - iter 144/1445 - loss 0.04407312 - time (sec): 7.35 - samples/sec: 2381.66 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 15:26:42,213 epoch 6 - iter 288/1445 - loss 0.07041699 - time (sec): 14.40 - samples/sec: 2363.33 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 15:26:49,199 epoch 6 - iter 432/1445 - loss 0.06543099 - time (sec): 21.39 - samples/sec: 2412.35 - lr: 0.000016 - momentum: 0.000000
+2023-10-17 15:26:56,329 epoch 6 - iter 576/1445 - loss 0.05494049 - time (sec): 28.52 - samples/sec: 2447.24 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 15:27:03,461 epoch 6 - iter 720/1445 - loss 0.05313692 - time (sec): 35.65 - samples/sec: 2468.33 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 15:27:10,213 epoch 6 - iter 864/1445 - loss 0.04946118 - time (sec): 42.40 - samples/sec: 2462.88 - lr: 0.000015 - momentum: 0.000000
+2023-10-17 15:27:16,992 epoch 6 - iter 1008/1445 - loss 0.04715937 - time (sec): 49.18 - samples/sec: 2492.01 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 15:27:23,900 epoch 6 - iter 1152/1445 - loss 0.04549664 - time (sec): 56.09 - samples/sec: 2479.47 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 15:27:30,845 epoch 6 - iter 1296/1445 - loss 0.04407099 - time (sec): 63.03 - samples/sec: 2486.69 - lr: 0.000014 - momentum: 0.000000
+2023-10-17 15:27:37,799 epoch 6 - iter 1440/1445 - loss 0.04231390 - time (sec): 69.99 - samples/sec: 2507.53 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 15:27:38,026 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:27:38,026 EPOCH 6 done: loss 0.0422 - lr: 0.000013
+2023-10-17 15:27:41,262 DEV : loss 0.12753015756607056 - f1-score (micro avg) 0.8179
+2023-10-17 15:27:41,277 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:27:48,074 epoch 7 - iter 144/1445 - loss 0.02864804 - time (sec): 6.80 - samples/sec: 2541.40 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 15:27:54,732 epoch 7 - iter 288/1445 - loss 0.02850202 - time (sec): 13.45 - samples/sec: 2529.29 - lr: 0.000013 - momentum: 0.000000
+2023-10-17 15:28:01,745 epoch 7 - iter 432/1445 - loss 0.02700154 - time (sec): 20.47 - samples/sec: 2538.36 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 15:28:09,285 epoch 7 - iter 576/1445 - loss 0.02685090 - time (sec): 28.01 - samples/sec: 2500.37 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 15:28:16,228 epoch 7 - iter 720/1445 - loss 0.02631576 - time (sec): 34.95 - samples/sec: 2498.86 - lr: 0.000012 - momentum: 0.000000
+2023-10-17 15:28:23,229 epoch 7 - iter 864/1445 - loss 0.02503229 - time (sec): 41.95 - samples/sec: 2527.10 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 15:28:30,267 epoch 7 - iter 1008/1445 - loss 0.02322476 - time (sec): 48.99 - samples/sec: 2513.53 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 15:28:37,208 epoch 7 - iter 1152/1445 - loss 0.02262618 - time (sec): 55.93 - samples/sec: 2507.14 - lr: 0.000011 - momentum: 0.000000
+2023-10-17 15:28:44,333 epoch 7 - iter 1296/1445 - loss 0.02359839 - time (sec): 63.05 - samples/sec: 2507.22 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 15:28:51,234 epoch 7 - iter 1440/1445 - loss 0.02410498 - time (sec): 69.96 - samples/sec: 2512.80 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 15:28:51,455 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:28:51,456 EPOCH 7 done: loss 0.0241 - lr: 0.000010
+2023-10-17 15:28:54,762 DEV : loss 0.13132773339748383 - f1-score (micro avg) 0.8242
+2023-10-17 15:28:54,780 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:29:01,739 epoch 8 - iter 144/1445 - loss 0.02599645 - time (sec): 6.96 - samples/sec: 2320.98 - lr: 0.000010 - momentum: 0.000000
+2023-10-17 15:29:09,413 epoch 8 - iter 288/1445 - loss 0.03908539 - time (sec): 14.63 - samples/sec: 2361.44 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 15:29:16,244 epoch 8 - iter 432/1445 - loss 0.04312177 - time (sec): 21.46 - samples/sec: 2436.90 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 15:29:23,284 epoch 8 - iter 576/1445 - loss 0.04654217 - time (sec): 28.50 - samples/sec: 2422.07 - lr: 0.000009 - momentum: 0.000000
+2023-10-17 15:29:30,118 epoch 8 - iter 720/1445 - loss 0.04836767 - time (sec): 35.34 - samples/sec: 2436.55 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 15:29:37,165 epoch 8 - iter 864/1445 - loss 0.04435024 - time (sec): 42.38 - samples/sec: 2467.42 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 15:29:44,054 epoch 8 - iter 1008/1445 - loss 0.04368948 - time (sec): 49.27 - samples/sec: 2483.93 - lr: 0.000008 - momentum: 0.000000
+2023-10-17 15:29:51,080 epoch 8 - iter 1152/1445 - loss 0.04126688 - time (sec): 56.30 - samples/sec: 2473.99 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 15:29:58,174 epoch 8 - iter 1296/1445 - loss 0.03803387 - time (sec): 63.39 - samples/sec: 2489.99 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 15:30:05,225 epoch 8 - iter 1440/1445 - loss 0.03611431 - time (sec): 70.44 - samples/sec: 2491.01 - lr: 0.000007 - momentum: 0.000000
+2023-10-17 15:30:05,479 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:30:05,479 EPOCH 8 done: loss 0.0360 - lr: 0.000007
+2023-10-17 15:30:08,822 DEV : loss 0.14063192903995514 - f1-score (micro avg) 0.8385
+2023-10-17 15:30:08,840 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:30:16,101 epoch 9 - iter 144/1445 - loss 0.00928856 - time (sec): 7.26 - samples/sec: 2645.70 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 15:30:22,830 epoch 9 - iter 288/1445 - loss 0.01137305 - time (sec): 13.99 - samples/sec: 2509.03 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 15:30:30,015 epoch 9 - iter 432/1445 - loss 0.01191848 - time (sec): 21.17 - samples/sec: 2557.99 - lr: 0.000006 - momentum: 0.000000
+2023-10-17 15:30:37,152 epoch 9 - iter 576/1445 - loss 0.01374329 - time (sec): 28.31 - samples/sec: 2558.62 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 15:30:44,261 epoch 9 - iter 720/1445 - loss 0.01533254 - time (sec): 35.42 - samples/sec: 2528.74 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 15:30:51,281 epoch 9 - iter 864/1445 - loss 0.01618231 - time (sec): 42.44 - samples/sec: 2487.04 - lr: 0.000005 - momentum: 0.000000
+2023-10-17 15:30:58,224 epoch 9 - iter 1008/1445 - loss 0.01852939 - time (sec): 49.38 - samples/sec: 2494.62 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 15:31:05,713 epoch 9 - iter 1152/1445 - loss 0.02014171 - time (sec): 56.87 - samples/sec: 2484.98 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 15:31:12,765 epoch 9 - iter 1296/1445 - loss 0.02108402 - time (sec): 63.92 - samples/sec: 2486.65 - lr: 0.000004 - momentum: 0.000000
+2023-10-17 15:31:19,478 epoch 9 - iter 1440/1445 - loss 0.02161319 - time (sec): 70.64 - samples/sec: 2483.77 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 15:31:19,741 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:31:19,741 EPOCH 9 done: loss 0.0215 - lr: 0.000003
+2023-10-17 15:31:22,962 DEV : loss 0.15872889757156372 - f1-score (micro avg) 0.7959
+2023-10-17 15:31:22,980 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:31:29,837 epoch 10 - iter 144/1445 - loss 0.02704362 - time (sec): 6.86 - samples/sec: 2545.17 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 15:31:36,571 epoch 10 - iter 288/1445 - loss 0.03061519 - time (sec): 13.59 - samples/sec: 2579.66 - lr: 0.000003 - momentum: 0.000000
+2023-10-17 15:31:43,536 epoch 10 - iter 432/1445 - loss 0.02831624 - time (sec): 20.55 - samples/sec: 2514.98 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 15:31:50,270 epoch 10 - iter 576/1445 - loss 0.02567299 - time (sec): 27.29 - samples/sec: 2489.86 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 15:31:57,397 epoch 10 - iter 720/1445 - loss 0.02269889 - time (sec): 34.42 - samples/sec: 2518.68 - lr: 0.000002 - momentum: 0.000000
+2023-10-17 15:32:04,472 epoch 10 - iter 864/1445 - loss 0.02217955 - time (sec): 41.49 - samples/sec: 2540.09 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 15:32:11,337 epoch 10 - iter 1008/1445 - loss 0.02100455 - time (sec): 48.36 - samples/sec: 2523.83 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 15:32:18,547 epoch 10 - iter 1152/1445 - loss 0.02030918 - time (sec): 55.57 - samples/sec: 2521.36 - lr: 0.000001 - momentum: 0.000000
+2023-10-17 15:32:25,459 epoch 10 - iter 1296/1445 - loss 0.01984945 - time (sec): 62.48 - samples/sec: 2530.72 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 15:32:32,454 epoch 10 - iter 1440/1445 - loss 0.01908284 - time (sec): 69.47 - samples/sec: 2529.70 - lr: 0.000000 - momentum: 0.000000
+2023-10-17 15:32:32,675 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:32:32,675 EPOCH 10 done: loss 0.0191 - lr: 0.000000
+2023-10-17 15:32:35,914 DEV : loss 0.14746923744678497 - f1-score (micro avg) 0.8243
+2023-10-17 15:32:36,350 ----------------------------------------------------------------------------------------------------
+2023-10-17 15:32:36,352 Loading model from best epoch ...
+2023-10-17 15:32:38,127 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
+2023-10-17 15:32:40,946
+Results:
+- F-score (micro) 0.8538
+- F-score (macro) 0.7349
+- Accuracy 0.7522
+
+By class:
+              precision    recall  f1-score   support
+
+         PER     0.8758    0.8340    0.8544       482
+         LOC     0.9385    0.8996    0.9186       458
+         ORG     0.4286    0.4348    0.4317        69
+
+   micro avg     0.8719    0.8365    0.8538      1009
+   macro avg     0.7476    0.7228    0.7349      1009
+weighted avg     0.8737    0.8365    0.8546      1009
+
+2023-10-17 15:32:40,946 ----------------------------------------------------------------------------------------------------
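
For context, the configuration recorded in this log (TransformerWordEmbeddings over hmteams/teams-base-historic-multilingual-discriminator, no CRF, mini-batch size 4, learning rate 3e-05, 10 epochs, linear warmup fraction 0.1, Dutch ICDAR-Europeana NER corpus) roughly corresponds to a Flair fine-tuning script like the sketch below. The dataset class, embedding arguments, and base path are inferred from the paths and parameters above, so treat this as an approximation rather than the exact script that produced this run:

    from flair.datasets import NER_ICDAR_EUROPEANA
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # Dutch ICDAR-Europeana NER corpus (5777 train / 722 dev / 723 test sentences per the log).
    corpus = NER_ICDAR_EUROPEANA(language="nl")
    label_dict = corpus.make_label_dictionary(label_type="ner")

    # Transformer embeddings; model name, pooling, and layer settings are inferred from the
    # training base path ("...teams-base-historic-multilingual-discriminator-bs4-...-poolingfirst-layers-1-crfFalse-1").
    embeddings = TransformerWordEmbeddings(
        model="hmteams/teams-base-historic-multilingual-discriminator",
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    # Tagger without CRF or RNN: a single linear layer over the embeddings, as in the printed architecture.
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
    )

    # fine_tune() applies a linear scheduler with warmup, matching the LinearScheduler plugin in the log.
    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-icdar/nl",  # base path for best-model.pt, loss.tsv, training.log, runs/
        learning_rate=3e-05,
        mini_batch_size=4,
        max_epochs=10,
    )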