stefan-it committed on
Commit d0814d6 · 1 Parent(s): fd091ca

Upload folder using huggingface_hub

best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3f7fbec72c4c30d8dca6203ce637d99a47de7ffc2da743c4ae0335b7dea321e7
+ size 440941957
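
best-model.pt is a Git LFS pointer; the ~440 MB checkpoint itself lives in LFS storage. A minimal sketch (my own assumption, based on the Flair SequenceTagger shown in training.log below) of loading the pulled checkpoint for tagging:

# Minimal sketch: load the checkpoint with Flair and tag one sentence.
# Assumes flair is installed and best-model.pt has been fetched via "git lfs pull".
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("best-model.pt")  # LFS-backed checkpoint from this commit

sentence = Sentence("Willem van Oranje werd geboren in Dillenburg .")  # example Dutch input
tagger.predict(sentence)

for entity in sentence.get_spans("ner"):  # "ner" label type assumed
    print(entity)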
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 15:21:29 0.0000 0.3901 0.1134 0.8550 0.5909 0.6988 0.5406
+ 2 15:22:43 0.0000 0.0928 0.0980 0.8694 0.6808 0.7636 0.6229
+ 3 15:23:58 0.0000 0.0679 0.0724 0.8808 0.8399 0.8599 0.7612
+ 4 15:25:12 0.0000 0.0513 0.0875 0.8631 0.8399 0.8513 0.7521
+ 5 15:26:27 0.0000 0.0423 0.1164 0.8706 0.7087 0.7813 0.6484
+ 6 15:27:41 0.0000 0.0422 0.1275 0.8547 0.7841 0.8179 0.6995
+ 7 15:28:54 0.0000 0.0241 0.1313 0.8538 0.7965 0.8242 0.7093
+ 8 15:30:08 0.0000 0.0360 0.1406 0.8753 0.8048 0.8385 0.7321
+ 9 15:31:22 0.0000 0.0215 0.1587 0.7983 0.7934 0.7959 0.6767
+ 10 15:32:35 0.0000 0.0191 0.1475 0.8425 0.8068 0.8243 0.7198
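
loss.tsv is the per-epoch metrics table written by the Flair trainer (one row per epoch, whitespace/tab separated). A small sketch, assuming pandas and matplotlib are installed, for inspecting it:

# Sketch: load the per-epoch metrics and find the best dev epoch.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("loss.tsv", sep=r"\s+")  # columns: EPOCH, TIMESTAMP, ..., TRAIN_LOSS, DEV_LOSS, DEV_F1, ...

best = df.loc[df["DEV_F1"].idxmax()]
print(f"best dev F1 {best['DEV_F1']:.4f} at epoch {int(best['EPOCH'])}")

df.plot(x="EPOCH", y=["TRAIN_LOSS", "DEV_LOSS"], marker="o")
plt.savefig("loss_curves.png")

On the table above this reports epoch 3 (dev F1 0.8599), which is the last epoch at which training.log records "saving best model".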
runs/events.out.tfevents.1697556015.bce904bcef33.2251.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cd57e1a2f8c4129d7c57d374753cac94583b392ccca33ce5a1ac7c98cd71c68e
+ size 808480
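
The runs/ folder holds the TensorBoard event file produced by the TensorboardLogger plugin listed in training.log. A sketch, assuming the tensorboard package is installed, for reading the logged scalars without starting the UI:

# Sketch: list and dump the scalar series stored in the event file.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

acc = EventAccumulator("runs")  # directory containing the .tfevents file
acc.Reload()

for tag in acc.Tags()["scalars"]:  # tag names depend on what the Flair logger wrote
    for event in acc.Scalars(tag):
        print(tag, event.step, event.value)

Alternatively, "tensorboard --logdir runs" serves the same data interactively.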
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,236 @@
+ 2023-10-17 15:20:15,009 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:15,010 Model: "SequenceTagger(
+ (embeddings): TransformerWordEmbeddings(
+ (model): ElectraModel(
+ (embeddings): ElectraEmbeddings(
+ (word_embeddings): Embedding(32001, 768)
+ (position_embeddings): Embedding(512, 768)
+ (token_type_embeddings): Embedding(2, 768)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (encoder): ElectraEncoder(
+ (layer): ModuleList(
+ (0-11): 12 x ElectraLayer(
+ (attention): ElectraAttention(
+ (self): ElectraSelfAttention(
+ (query): Linear(in_features=768, out_features=768, bias=True)
+ (key): Linear(in_features=768, out_features=768, bias=True)
+ (value): Linear(in_features=768, out_features=768, bias=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (output): ElectraSelfOutput(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (intermediate): ElectraIntermediate(
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
+ (intermediate_act_fn): GELUActivation()
+ )
+ (output): ElectraOutput(
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ )
+ )
+ )
+ )
+ (locked_dropout): LockedDropout(p=0.5)
+ (linear): Linear(in_features=768, out_features=13, bias=True)
+ (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-17 15:20:15,010 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:15,010 MultiCorpus: 5777 train + 722 dev + 723 test sentences
+ - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
+ 2023-10-17 15:20:15,010 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:15,010 Train: 5777 sentences
+ 2023-10-17 15:20:15,011 (train_with_dev=False, train_with_test=False)
+ 2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:15,011 Training Params:
+ 2023-10-17 15:20:15,011 - learning_rate: "3e-05"
+ 2023-10-17 15:20:15,011 - mini_batch_size: "4"
+ 2023-10-17 15:20:15,011 - max_epochs: "10"
+ 2023-10-17 15:20:15,011 - shuffle: "True"
+ 2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:15,011 Plugins:
+ 2023-10-17 15:20:15,011 - TensorboardLogger
+ 2023-10-17 15:20:15,011 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:15,011 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-17 15:20:15,011 - metric: "('micro avg', 'f1-score')"
+ 2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:15,011 Computation:
+ 2023-10-17 15:20:15,011 - compute on device: cuda:0
+ 2023-10-17 15:20:15,011 - embedding storage: none
+ 2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:15,011 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
+ 2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:20:15,011 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-17 15:20:23,073 epoch 1 - iter 144/1445 - loss 2.33324949 - time (sec): 8.06 - samples/sec: 2302.06 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 15:20:30,031 epoch 1 - iter 288/1445 - loss 1.40669834 - time (sec): 15.02 - samples/sec: 2293.24 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 15:20:37,277 epoch 1 - iter 432/1445 - loss 0.99311609 - time (sec): 22.26 - samples/sec: 2340.14 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 15:20:44,228 epoch 1 - iter 576/1445 - loss 0.79532458 - time (sec): 29.22 - samples/sec: 2337.21 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 15:20:51,036 epoch 1 - iter 720/1445 - loss 0.66249281 - time (sec): 36.02 - samples/sec: 2401.24 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 15:20:58,193 epoch 1 - iter 864/1445 - loss 0.56791320 - time (sec): 43.18 - samples/sec: 2443.46 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 15:21:05,065 epoch 1 - iter 1008/1445 - loss 0.50430390 - time (sec): 50.05 - samples/sec: 2460.17 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 15:21:12,157 epoch 1 - iter 1152/1445 - loss 0.45583800 - time (sec): 57.14 - samples/sec: 2469.61 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 15:21:19,146 epoch 1 - iter 1296/1445 - loss 0.42025604 - time (sec): 64.13 - samples/sec: 2471.73 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 15:21:26,123 epoch 1 - iter 1440/1445 - loss 0.39100512 - time (sec): 71.11 - samples/sec: 2470.36 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 15:21:26,351 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:21:26,351 EPOCH 1 done: loss 0.3901 - lr: 0.000030
+ 2023-10-17 15:21:29,006 DEV : loss 0.1134478896856308 - f1-score (micro avg) 0.6988
+ 2023-10-17 15:21:29,021 saving best model
+ 2023-10-17 15:21:29,419 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:21:36,099 epoch 2 - iter 144/1445 - loss 0.11423400 - time (sec): 6.68 - samples/sec: 2491.07 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 15:21:42,808 epoch 2 - iter 288/1445 - loss 0.11006024 - time (sec): 13.39 - samples/sec: 2534.73 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 15:21:49,527 epoch 2 - iter 432/1445 - loss 0.10104983 - time (sec): 20.11 - samples/sec: 2552.67 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 15:21:56,423 epoch 2 - iter 576/1445 - loss 0.09617376 - time (sec): 27.00 - samples/sec: 2535.33 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 15:22:03,695 epoch 2 - iter 720/1445 - loss 0.09369082 - time (sec): 34.27 - samples/sec: 2531.86 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 15:22:11,248 epoch 2 - iter 864/1445 - loss 0.09042169 - time (sec): 41.83 - samples/sec: 2531.47 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 15:22:18,355 epoch 2 - iter 1008/1445 - loss 0.09025166 - time (sec): 48.93 - samples/sec: 2507.99 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 15:22:25,543 epoch 2 - iter 1152/1445 - loss 0.09048999 - time (sec): 56.12 - samples/sec: 2502.98 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 15:22:32,470 epoch 2 - iter 1296/1445 - loss 0.09136875 - time (sec): 63.05 - samples/sec: 2497.21 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 15:22:39,747 epoch 2 - iter 1440/1445 - loss 0.09254084 - time (sec): 70.33 - samples/sec: 2499.02 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 15:22:39,976 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:22:39,977 EPOCH 2 done: loss 0.0928 - lr: 0.000027
+ 2023-10-17 15:22:43,507 DEV : loss 0.09797008335590363 - f1-score (micro avg) 0.7636
+ 2023-10-17 15:22:43,523 saving best model
+ 2023-10-17 15:22:44,069 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:22:51,178 epoch 3 - iter 144/1445 - loss 0.07299256 - time (sec): 7.11 - samples/sec: 2444.22 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 15:22:57,977 epoch 3 - iter 288/1445 - loss 0.06634279 - time (sec): 13.91 - samples/sec: 2490.30 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 15:23:05,074 epoch 3 - iter 432/1445 - loss 0.06581683 - time (sec): 21.00 - samples/sec: 2551.84 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 15:23:11,936 epoch 3 - iter 576/1445 - loss 0.07201697 - time (sec): 27.87 - samples/sec: 2538.19 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 15:23:18,903 epoch 3 - iter 720/1445 - loss 0.07104181 - time (sec): 34.83 - samples/sec: 2511.65 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 15:23:25,927 epoch 3 - iter 864/1445 - loss 0.07023237 - time (sec): 41.86 - samples/sec: 2516.01 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 15:23:33,061 epoch 3 - iter 1008/1445 - loss 0.06913832 - time (sec): 48.99 - samples/sec: 2490.96 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 15:23:40,153 epoch 3 - iter 1152/1445 - loss 0.06852697 - time (sec): 56.08 - samples/sec: 2486.94 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 15:23:47,600 epoch 3 - iter 1296/1445 - loss 0.06920774 - time (sec): 63.53 - samples/sec: 2481.25 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 15:23:54,842 epoch 3 - iter 1440/1445 - loss 0.06776713 - time (sec): 70.77 - samples/sec: 2483.99 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 15:23:55,074 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:23:55,074 EPOCH 3 done: loss 0.0679 - lr: 0.000023
+ 2023-10-17 15:23:58,305 DEV : loss 0.07238871604204178 - f1-score (micro avg) 0.8599
+ 2023-10-17 15:23:58,320 saving best model
+ 2023-10-17 15:23:58,869 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:24:05,938 epoch 4 - iter 144/1445 - loss 0.03825652 - time (sec): 7.07 - samples/sec: 2582.66 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 15:24:13,011 epoch 4 - iter 288/1445 - loss 0.05121297 - time (sec): 14.14 - samples/sec: 2515.02 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 15:24:20,248 epoch 4 - iter 432/1445 - loss 0.04707007 - time (sec): 21.38 - samples/sec: 2475.52 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 15:24:27,099 epoch 4 - iter 576/1445 - loss 0.04864143 - time (sec): 28.23 - samples/sec: 2490.67 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 15:24:34,071 epoch 4 - iter 720/1445 - loss 0.05041580 - time (sec): 35.20 - samples/sec: 2469.04 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 15:24:41,242 epoch 4 - iter 864/1445 - loss 0.05057361 - time (sec): 42.37 - samples/sec: 2466.87 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 15:24:48,213 epoch 4 - iter 1008/1445 - loss 0.04981836 - time (sec): 49.34 - samples/sec: 2478.95 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 15:24:54,867 epoch 4 - iter 1152/1445 - loss 0.05022101 - time (sec): 56.00 - samples/sec: 2498.44 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 15:25:01,564 epoch 4 - iter 1296/1445 - loss 0.05005487 - time (sec): 62.69 - samples/sec: 2514.15 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 15:25:08,655 epoch 4 - iter 1440/1445 - loss 0.05140703 - time (sec): 69.78 - samples/sec: 2519.10 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 15:25:08,886 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:25:08,886 EPOCH 4 done: loss 0.0513 - lr: 0.000020
+ 2023-10-17 15:25:12,521 DEV : loss 0.08748035877943039 - f1-score (micro avg) 0.8513
+ 2023-10-17 15:25:12,536 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:25:20,019 epoch 5 - iter 144/1445 - loss 0.02484292 - time (sec): 7.48 - samples/sec: 2365.34 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 15:25:27,347 epoch 5 - iter 288/1445 - loss 0.02639251 - time (sec): 14.81 - samples/sec: 2426.84 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 15:25:34,576 epoch 5 - iter 432/1445 - loss 0.03062150 - time (sec): 22.04 - samples/sec: 2438.65 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 15:25:41,500 epoch 5 - iter 576/1445 - loss 0.03478141 - time (sec): 28.96 - samples/sec: 2433.14 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 15:25:48,647 epoch 5 - iter 720/1445 - loss 0.03435012 - time (sec): 36.11 - samples/sec: 2431.81 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 15:25:55,712 epoch 5 - iter 864/1445 - loss 0.03915073 - time (sec): 43.17 - samples/sec: 2429.56 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 15:26:02,874 epoch 5 - iter 1008/1445 - loss 0.04019423 - time (sec): 50.34 - samples/sec: 2424.89 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 15:26:09,941 epoch 5 - iter 1152/1445 - loss 0.04137874 - time (sec): 57.40 - samples/sec: 2449.49 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 15:26:16,823 epoch 5 - iter 1296/1445 - loss 0.04231658 - time (sec): 64.29 - samples/sec: 2455.18 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 15:26:24,165 epoch 5 - iter 1440/1445 - loss 0.04229848 - time (sec): 71.63 - samples/sec: 2452.34 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 15:26:24,411 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:26:24,411 EPOCH 5 done: loss 0.0423 - lr: 0.000017
+ 2023-10-17 15:26:27,792 DEV : loss 0.11643949151039124 - f1-score (micro avg) 0.7813
+ 2023-10-17 15:26:27,810 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:26:35,160 epoch 6 - iter 144/1445 - loss 0.04407312 - time (sec): 7.35 - samples/sec: 2381.66 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 15:26:42,213 epoch 6 - iter 288/1445 - loss 0.07041699 - time (sec): 14.40 - samples/sec: 2363.33 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 15:26:49,199 epoch 6 - iter 432/1445 - loss 0.06543099 - time (sec): 21.39 - samples/sec: 2412.35 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 15:26:56,329 epoch 6 - iter 576/1445 - loss 0.05494049 - time (sec): 28.52 - samples/sec: 2447.24 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 15:27:03,461 epoch 6 - iter 720/1445 - loss 0.05313692 - time (sec): 35.65 - samples/sec: 2468.33 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 15:27:10,213 epoch 6 - iter 864/1445 - loss 0.04946118 - time (sec): 42.40 - samples/sec: 2462.88 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 15:27:16,992 epoch 6 - iter 1008/1445 - loss 0.04715937 - time (sec): 49.18 - samples/sec: 2492.01 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 15:27:23,900 epoch 6 - iter 1152/1445 - loss 0.04549664 - time (sec): 56.09 - samples/sec: 2479.47 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 15:27:30,845 epoch 6 - iter 1296/1445 - loss 0.04407099 - time (sec): 63.03 - samples/sec: 2486.69 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 15:27:37,799 epoch 6 - iter 1440/1445 - loss 0.04231390 - time (sec): 69.99 - samples/sec: 2507.53 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 15:27:38,026 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:27:38,026 EPOCH 6 done: loss 0.0422 - lr: 0.000013
+ 2023-10-17 15:27:41,262 DEV : loss 0.12753015756607056 - f1-score (micro avg) 0.8179
+ 2023-10-17 15:27:41,277 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:27:48,074 epoch 7 - iter 144/1445 - loss 0.02864804 - time (sec): 6.80 - samples/sec: 2541.40 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 15:27:54,732 epoch 7 - iter 288/1445 - loss 0.02850202 - time (sec): 13.45 - samples/sec: 2529.29 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 15:28:01,745 epoch 7 - iter 432/1445 - loss 0.02700154 - time (sec): 20.47 - samples/sec: 2538.36 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 15:28:09,285 epoch 7 - iter 576/1445 - loss 0.02685090 - time (sec): 28.01 - samples/sec: 2500.37 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 15:28:16,228 epoch 7 - iter 720/1445 - loss 0.02631576 - time (sec): 34.95 - samples/sec: 2498.86 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 15:28:23,229 epoch 7 - iter 864/1445 - loss 0.02503229 - time (sec): 41.95 - samples/sec: 2527.10 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 15:28:30,267 epoch 7 - iter 1008/1445 - loss 0.02322476 - time (sec): 48.99 - samples/sec: 2513.53 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 15:28:37,208 epoch 7 - iter 1152/1445 - loss 0.02262618 - time (sec): 55.93 - samples/sec: 2507.14 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 15:28:44,333 epoch 7 - iter 1296/1445 - loss 0.02359839 - time (sec): 63.05 - samples/sec: 2507.22 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 15:28:51,234 epoch 7 - iter 1440/1445 - loss 0.02410498 - time (sec): 69.96 - samples/sec: 2512.80 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 15:28:51,455 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:28:51,456 EPOCH 7 done: loss 0.0241 - lr: 0.000010
+ 2023-10-17 15:28:54,762 DEV : loss 0.13132773339748383 - f1-score (micro avg) 0.8242
+ 2023-10-17 15:28:54,780 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:29:01,739 epoch 8 - iter 144/1445 - loss 0.02599645 - time (sec): 6.96 - samples/sec: 2320.98 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 15:29:09,413 epoch 8 - iter 288/1445 - loss 0.03908539 - time (sec): 14.63 - samples/sec: 2361.44 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 15:29:16,244 epoch 8 - iter 432/1445 - loss 0.04312177 - time (sec): 21.46 - samples/sec: 2436.90 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 15:29:23,284 epoch 8 - iter 576/1445 - loss 0.04654217 - time (sec): 28.50 - samples/sec: 2422.07 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 15:29:30,118 epoch 8 - iter 720/1445 - loss 0.04836767 - time (sec): 35.34 - samples/sec: 2436.55 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 15:29:37,165 epoch 8 - iter 864/1445 - loss 0.04435024 - time (sec): 42.38 - samples/sec: 2467.42 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 15:29:44,054 epoch 8 - iter 1008/1445 - loss 0.04368948 - time (sec): 49.27 - samples/sec: 2483.93 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 15:29:51,080 epoch 8 - iter 1152/1445 - loss 0.04126688 - time (sec): 56.30 - samples/sec: 2473.99 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 15:29:58,174 epoch 8 - iter 1296/1445 - loss 0.03803387 - time (sec): 63.39 - samples/sec: 2489.99 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 15:30:05,225 epoch 8 - iter 1440/1445 - loss 0.03611431 - time (sec): 70.44 - samples/sec: 2491.01 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 15:30:05,479 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:30:05,479 EPOCH 8 done: loss 0.0360 - lr: 0.000007
+ 2023-10-17 15:30:08,822 DEV : loss 0.14063192903995514 - f1-score (micro avg) 0.8385
+ 2023-10-17 15:30:08,840 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:30:16,101 epoch 9 - iter 144/1445 - loss 0.00928856 - time (sec): 7.26 - samples/sec: 2645.70 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 15:30:22,830 epoch 9 - iter 288/1445 - loss 0.01137305 - time (sec): 13.99 - samples/sec: 2509.03 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 15:30:30,015 epoch 9 - iter 432/1445 - loss 0.01191848 - time (sec): 21.17 - samples/sec: 2557.99 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 15:30:37,152 epoch 9 - iter 576/1445 - loss 0.01374329 - time (sec): 28.31 - samples/sec: 2558.62 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 15:30:44,261 epoch 9 - iter 720/1445 - loss 0.01533254 - time (sec): 35.42 - samples/sec: 2528.74 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 15:30:51,281 epoch 9 - iter 864/1445 - loss 0.01618231 - time (sec): 42.44 - samples/sec: 2487.04 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 15:30:58,224 epoch 9 - iter 1008/1445 - loss 0.01852939 - time (sec): 49.38 - samples/sec: 2494.62 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 15:31:05,713 epoch 9 - iter 1152/1445 - loss 0.02014171 - time (sec): 56.87 - samples/sec: 2484.98 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 15:31:12,765 epoch 9 - iter 1296/1445 - loss 0.02108402 - time (sec): 63.92 - samples/sec: 2486.65 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 15:31:19,478 epoch 9 - iter 1440/1445 - loss 0.02161319 - time (sec): 70.64 - samples/sec: 2483.77 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 15:31:19,741 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:31:19,741 EPOCH 9 done: loss 0.0215 - lr: 0.000003
+ 2023-10-17 15:31:22,962 DEV : loss 0.15872889757156372 - f1-score (micro avg) 0.7959
+ 2023-10-17 15:31:22,980 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:31:29,837 epoch 10 - iter 144/1445 - loss 0.02704362 - time (sec): 6.86 - samples/sec: 2545.17 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 15:31:36,571 epoch 10 - iter 288/1445 - loss 0.03061519 - time (sec): 13.59 - samples/sec: 2579.66 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 15:31:43,536 epoch 10 - iter 432/1445 - loss 0.02831624 - time (sec): 20.55 - samples/sec: 2514.98 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 15:31:50,270 epoch 10 - iter 576/1445 - loss 0.02567299 - time (sec): 27.29 - samples/sec: 2489.86 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 15:31:57,397 epoch 10 - iter 720/1445 - loss 0.02269889 - time (sec): 34.42 - samples/sec: 2518.68 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 15:32:04,472 epoch 10 - iter 864/1445 - loss 0.02217955 - time (sec): 41.49 - samples/sec: 2540.09 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 15:32:11,337 epoch 10 - iter 1008/1445 - loss 0.02100455 - time (sec): 48.36 - samples/sec: 2523.83 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 15:32:18,547 epoch 10 - iter 1152/1445 - loss 0.02030918 - time (sec): 55.57 - samples/sec: 2521.36 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 15:32:25,459 epoch 10 - iter 1296/1445 - loss 0.01984945 - time (sec): 62.48 - samples/sec: 2530.72 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 15:32:32,454 epoch 10 - iter 1440/1445 - loss 0.01908284 - time (sec): 69.47 - samples/sec: 2529.70 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 15:32:32,675 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:32:32,675 EPOCH 10 done: loss 0.0191 - lr: 0.000000
+ 2023-10-17 15:32:35,914 DEV : loss 0.14746923744678497 - f1-score (micro avg) 0.8243
+ 2023-10-17 15:32:36,350 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 15:32:36,352 Loading model from best epoch ...
+ 2023-10-17 15:32:38,127 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
+ 2023-10-17 15:32:40,946
+ Results:
+ - F-score (micro) 0.8538
+ - F-score (macro) 0.7349
+ - Accuracy 0.7522
+
+ By class:
+ precision recall f1-score support
+
+ PER 0.8758 0.8340 0.8544 482
+ LOC 0.9385 0.8996 0.9186 458
+ ORG 0.4286 0.4348 0.4317 69
+
+ micro avg 0.8719 0.8365 0.8538 1009
+ macro avg 0.7476 0.7228 0.7349 1009
+ weighted avg 0.8737 0.8365 0.8546 1009
+
+ 2023-10-17 15:32:40,946 ----------------------------------------------------------------------------------------------------
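
training.log records the full configuration: NER_ICDAR_EUROPEANA (Dutch), TransformerWordEmbeddings over an ELECTRA-style encoder, no CRF, learning rate 3e-05, mini-batch size 4, 10 epochs with a linear schedule (warmup fraction 0.1). Below is a hedged sketch of how such a run could be set up with Flair's fine-tuning API; it is not the exact script used here, the base model name is inferred from the logged base path, and the output directory is a placeholder.

# Sketch of the logged configuration (lr 3e-05, batch 4, 10 epochs, no CRF).
# Dataset class, argument names, and defaults assume a recent Flair release.
from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

corpus = NER_ICDAR_EUROPEANA(language="nl")  # 5777 train / 722 dev / 723 test sentences
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # inferred from the base path
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,            # unused here: no RNN between embeddings and the linear layer
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "output",                   # placeholder output directory
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
)

In recent Flair releases fine_tune attaches a linear scheduler with warmup by default, which matches the "LinearScheduler | warmup_fraction: '0.1'" line in the log; the TensorboardLogger plugin would be added separately.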