2023-10-18 20:51:50,862 ----------------------------------------------------------------------------------------------------
2023-10-18 20:51:50,862 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 20:51:50,862 ----------------------------------------------------------------------------------------------------
2023-10-18 20:51:50,863 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 20:51:50,863 ----------------------------------------------------------------------------------------------------
2023-10-18 20:51:50,863 Train: 7936 sentences
2023-10-18 20:51:50,863 (train_with_dev=False, train_with_test=False)
2023-10-18 20:51:50,863 ----------------------------------------------------------------------------------------------------
2023-10-18 20:51:50,863 Training Params:
2023-10-18 20:51:50,863 - learning_rate: "3e-05"
2023-10-18 20:51:50,863 - mini_batch_size: "8"
2023-10-18 20:51:50,863 - max_epochs: "10"
2023-10-18 20:51:50,863 - shuffle: "True"
2023-10-18 20:51:50,863 ----------------------------------------------------------------------------------------------------
2023-10-18 20:51:50,863 Plugins:
2023-10-18 20:51:50,863 - TensorboardLogger
2023-10-18 20:51:50,863 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 20:51:50,863 ----------------------------------------------------------------------------------------------------
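The `lr` column in the per-iteration lines of this log can be reproduced approximately from the parameters listed above (learning_rate 3e-05, max_epochs 10, 992 iterations per epoch, warmup_fraction 0.1). The sketch below assumes the common shape of such a schedule, linear warmup to the peak rate followed by linear decay to zero. The function names `linear_schedule_lr` and `lr_at` are hypothetical helpers for illustration and are not taken from Flair's LinearScheduler code.

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction):
    """Learning rate at a given optimizer step: linear warmup, then linear decay.

    Assumed schedule shape, not Flair's actual implementation.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: ramp linearly from peak_lr down to 0 at total_steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values read from this log.
ITERS_PER_EPOCH = 992
MAX_EPOCHS = 10
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS
PEAK_LR = 3e-05
WARMUP_FRACTION = 0.1


def lr_at(epoch, iteration):
    """Learning rate at a 1-based epoch and iteration, as numbered in the log."""
    step = (epoch - 1) * ITERS_PER_EPOCH + iteration
    return linear_schedule_lr(step, TOTAL_STEPS, PEAK_LR, WARMUP_FRACTION)


if __name__ == "__main__":
    for epoch, iteration in [(1, 99), (1, 990), (2, 198), (5, 990), (10, 990)]:
        print(f"epoch {epoch} iter {iteration}: lr {lr_at(epoch, iteration):.6f}")
```

Under these assumptions the warmup covers 992 steps, which is exactly epoch 1, and the rounded values match the logged `lr` at the sampled iterations (for example 0.000003 at epoch 1 iter 99 and 0.000017 at epoch 5 iter 990).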
2023-10-18 20:51:50,863 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 20:51:50,863 - metric: "('micro avg', 'f1-score')"
2023-10-18 20:51:50,863 ----------------------------------------------------------------------------------------------------
2023-10-18 20:51:50,863 Computation:
2023-10-18 20:51:50,863 - compute on device: cuda:0
2023-10-18 20:51:50,863 - embedding storage: none
2023-10-18 20:51:50,863 ----------------------------------------------------------------------------------------------------
2023-10-18 20:51:50,863 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-18 20:51:50,863 ----------------------------------------------------------------------------------------------------
2023-10-18 20:51:50,863 ----------------------------------------------------------------------------------------------------
2023-10-18 20:51:50,863 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 20:51:53,281 epoch 1 - iter 99/992 - loss 2.42306355 - time (sec): 2.42 - samples/sec: 6817.15 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:51:55,609 epoch 1 - iter 198/992 - loss 2.21764434 - time (sec): 4.75 - samples/sec: 6893.42 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:51:57,874 epoch 1 - iter 297/992 - loss 1.90812069 - time (sec): 7.01 - samples/sec: 7104.88 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:52:00,102 epoch 1 - iter 396/992 - loss 1.60298911 - time (sec): 9.24 - samples/sec: 7162.65 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:52:02,336 epoch 1 - iter 495/992 - loss 1.39460148 - time (sec): 11.47 - samples/sec: 7253.67 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:52:04,570 epoch 1 - iter 594/992 - loss 1.24571099 - time (sec): 13.71 - samples/sec: 7251.89 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:52:06,806 epoch 1 - iter 693/992 - loss 1.13088356 - time (sec): 15.94 - samples/sec: 7261.58 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:52:09,007 epoch 1 - iter 792/992 - loss 1.03882783 - time (sec): 18.14 - samples/sec: 7255.30 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:52:11,267 epoch 1 - iter 891/992 - loss 0.97090243 - time (sec): 20.40 - samples/sec: 7218.69 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:52:13,572 epoch 1 - iter 990/992 - loss 0.91052903 - time (sec): 22.71 - samples/sec: 7208.11 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:52:13,616 ----------------------------------------------------------------------------------------------------
2023-10-18 20:52:13,616 EPOCH 1 done: loss 0.9089 - lr: 0.000030
2023-10-18 20:52:15,124 DEV : loss 0.23747262358665466 - f1-score (micro avg) 0.2194
2023-10-18 20:52:15,143 saving best model
2023-10-18 20:52:15,175 ----------------------------------------------------------------------------------------------------
2023-10-18 20:52:17,403 epoch 2 - iter 99/992 - loss 0.34838492 - time (sec): 2.23 - samples/sec: 7243.16 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:52:20,106 epoch 2 - iter 198/992 - loss 0.31688513 - time (sec): 4.93 - samples/sec: 6753.96 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:52:22,343 epoch 2 - iter 297/992 - loss 0.31470515 - time (sec): 7.17 - samples/sec: 6839.20 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:52:24,563 epoch 2 - iter 396/992 - loss 0.31410203 - time (sec): 9.39 - samples/sec: 6947.41 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:52:26,781 epoch 2 - iter 495/992 - loss 0.30497184 - time (sec): 11.61 - samples/sec: 7005.44 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:52:29,033 epoch 2 - iter 594/992 - loss 0.30271100 - time (sec): 13.86 - samples/sec: 7014.07 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:52:31,321 epoch 2 - iter 693/992 - loss 0.30336907 - time (sec): 16.15 - samples/sec: 7001.83 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:52:33,566 epoch 2 - iter 792/992 - loss 0.30106027 - time (sec): 18.39 - samples/sec: 7067.54 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:52:35,840 epoch 2 - iter 891/992 - loss 0.30102108 - time (sec): 20.66 - samples/sec: 7119.36 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:52:38,098 epoch 2 - iter 990/992 - loss 0.29802060 - time (sec): 22.92 - samples/sec: 7139.84 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:52:38,146 ----------------------------------------------------------------------------------------------------
2023-10-18 20:52:38,146 EPOCH 2 done: loss 0.2981 - lr: 0.000027
2023-10-18 20:52:39,977 DEV : loss 0.2019718885421753 - f1-score (micro avg) 0.333
2023-10-18 20:52:39,995 saving best model
2023-10-18 20:52:40,030 ----------------------------------------------------------------------------------------------------
2023-10-18 20:52:42,204 epoch 3 - iter 99/992 - loss 0.25674067 - time (sec): 2.17 - samples/sec: 7699.56 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:52:44,405 epoch 3 - iter 198/992 - loss 0.26870989 - time (sec): 4.37 - samples/sec: 7471.38 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:52:46,670 epoch 3 - iter 297/992 - loss 0.27125039 - time (sec): 6.64 - samples/sec: 7366.32 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:52:48,945 epoch 3 - iter 396/992 - loss 0.26816170 - time (sec): 8.91 - samples/sec: 7272.18 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:52:51,223 epoch 3 - iter 495/992 - loss 0.26031992 - time (sec): 11.19 - samples/sec: 7295.87 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:52:53,401 epoch 3 - iter 594/992 - loss 0.26294179 - time (sec): 13.37 - samples/sec: 7260.85 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:52:55,672 epoch 3 - iter 693/992 - loss 0.27016265 - time (sec): 15.64 - samples/sec: 7260.38 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:52:57,880 epoch 3 - iter 792/992 - loss 0.26656315 - time (sec): 17.85 - samples/sec: 7303.47 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:53:00,090 epoch 3 - iter 891/992 - loss 0.26240928 - time (sec): 20.06 - samples/sec: 7327.88 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:53:02,420 epoch 3 - iter 990/992 - loss 0.26065555 - time (sec): 22.39 - samples/sec: 7304.54 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:53:02,472 ----------------------------------------------------------------------------------------------------
2023-10-18 20:53:02,472 EPOCH 3 done: loss 0.2605 - lr: 0.000023
2023-10-18 20:53:04,287 DEV : loss 0.19067135453224182 - f1-score (micro avg) 0.3635
2023-10-18 20:53:04,306 saving best model
2023-10-18 20:53:04,341 ----------------------------------------------------------------------------------------------------
2023-10-18 20:53:06,510 epoch 4 - iter 99/992 - loss 0.25239806 - time (sec): 2.17 - samples/sec: 7264.69 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:53:08,706 epoch 4 - iter 198/992 - loss 0.24991484 - time (sec): 4.36 - samples/sec: 7480.62 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:53:10,917 epoch 4 - iter 297/992 - loss 0.24563286 - time (sec): 6.58 - samples/sec: 7280.98 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:53:13,138 epoch 4 - iter 396/992 - loss 0.24305426 - time (sec): 8.80 - samples/sec: 7222.81 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:53:15,385 epoch 4 - iter 495/992 - loss 0.24520370 - time (sec): 11.04 - samples/sec: 7298.48 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:53:17,680 epoch 4 - iter 594/992 - loss 0.24264463 - time (sec): 13.34 - samples/sec: 7270.56 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:53:19,961 epoch 4 - iter 693/992 - loss 0.24046413 - time (sec): 15.62 - samples/sec: 7253.51 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:53:22,226 epoch 4 - iter 792/992 - loss 0.24173072 - time (sec): 17.88 - samples/sec: 7263.53 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:53:24,493 epoch 4 - iter 891/992 - loss 0.23789589 - time (sec): 20.15 - samples/sec: 7283.37 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:53:26,770 epoch 4 - iter 990/992 - loss 0.23782954 - time (sec): 22.43 - samples/sec: 7294.84 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:53:26,816 ----------------------------------------------------------------------------------------------------
2023-10-18 20:53:26,816 EPOCH 4 done: loss 0.2376 - lr: 0.000020
2023-10-18 20:53:28,648 DEV : loss 0.17912989854812622 - f1-score (micro avg) 0.3731
2023-10-18 20:53:28,667 saving best model
2023-10-18 20:53:28,703 ----------------------------------------------------------------------------------------------------
2023-10-18 20:53:31,192 epoch 5 - iter 99/992 - loss 0.20190982 - time (sec): 2.49 - samples/sec: 6724.85 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:53:33,466 epoch 5 - iter 198/992 - loss 0.21079062 - time (sec): 4.76 - samples/sec: 6812.55 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:53:35,707 epoch 5 - iter 297/992 - loss 0.21616584 - time (sec): 7.00 - samples/sec: 6809.75 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:53:37,906 epoch 5 - iter 396/992 - loss 0.21632794 - time (sec): 9.20 - samples/sec: 7007.42 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:53:40,108 epoch 5 - iter 495/992 - loss 0.21820585 - time (sec): 11.40 - samples/sec: 7057.11 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:53:42,289 epoch 5 - iter 594/992 - loss 0.21997889 - time (sec): 13.59 - samples/sec: 7110.83 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:53:44,510 epoch 5 - iter 693/992 - loss 0.22087534 - time (sec): 15.81 - samples/sec: 7163.55 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:53:46,779 epoch 5 - iter 792/992 - loss 0.21945109 - time (sec): 18.08 - samples/sec: 7190.02 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:53:48,989 epoch 5 - iter 891/992 - loss 0.22199106 - time (sec): 20.29 - samples/sec: 7244.25 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:53:51,198 epoch 5 - iter 990/992 - loss 0.22507917 - time (sec): 22.49 - samples/sec: 7273.09 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:53:51,244 ----------------------------------------------------------------------------------------------------
2023-10-18 20:53:51,244 EPOCH 5 done: loss 0.2252 - lr: 0.000017
2023-10-18 20:53:53,099 DEV : loss 0.17070569097995758 - f1-score (micro avg) 0.388
2023-10-18 20:53:53,117 saving best model
2023-10-18 20:53:53,152 ----------------------------------------------------------------------------------------------------
2023-10-18 20:53:55,176 epoch 6 - iter 99/992 - loss 0.21435041 - time (sec): 2.02 - samples/sec: 8064.65 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:53:57,335 epoch 6 - iter 198/992 - loss 0.20832594 - time (sec): 4.18 - samples/sec: 7647.91 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:53:59,580 epoch 6 - iter 297/992 - loss 0.20727939 - time (sec): 6.43 - samples/sec: 7625.66 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:54:01,825 epoch 6 - iter 396/992 - loss 0.20671655 - time (sec): 8.67 - samples/sec: 7502.11 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:54:04,029 epoch 6 - iter 495/992 - loss 0.20859057 - time (sec): 10.88 - samples/sec: 7448.22 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:54:06,262 epoch 6 - iter 594/992 - loss 0.20604312 - time (sec): 13.11 - samples/sec: 7436.72 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:54:08,487 epoch 6 - iter 693/992 - loss 0.20717609 - time (sec): 15.33 - samples/sec: 7394.52 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:54:10,832 epoch 6 - iter 792/992 - loss 0.20730451 - time (sec): 17.68 - samples/sec: 7321.27 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:54:13,157 epoch 6 - iter 891/992 - loss 0.20936951 - time (sec): 20.00 - samples/sec: 7280.12 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:54:15,481 epoch 6 - iter 990/992 - loss 0.21222474 - time (sec): 22.33 - samples/sec: 7330.00 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:54:15,533 ----------------------------------------------------------------------------------------------------
2023-10-18 20:54:15,533 EPOCH 6 done: loss 0.2121 - lr: 0.000013
2023-10-18 20:54:17,751 DEV : loss 0.16514191031455994 - f1-score (micro avg) 0.3937
2023-10-18 20:54:17,770 saving best model
2023-10-18 20:54:17,806 ----------------------------------------------------------------------------------------------------
2023-10-18 20:54:20,052 epoch 7 - iter 99/992 - loss 0.21964137 - time (sec): 2.25 - samples/sec: 7179.31 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:54:22,341 epoch 7 - iter 198/992 - loss 0.21098183 - time (sec): 4.53 - samples/sec: 7076.71 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:54:24,694 epoch 7 - iter 297/992 - loss 0.21018550 - time (sec): 6.89 - samples/sec: 7109.21 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:54:26,927 epoch 7 - iter 396/992 - loss 0.21566329 - time (sec): 9.12 - samples/sec: 7173.22 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:54:29,180 epoch 7 - iter 495/992 - loss 0.20941511 - time (sec): 11.37 - samples/sec: 7197.51 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:54:31,434 epoch 7 - iter 594/992 - loss 0.20773689 - time (sec): 13.63 - samples/sec: 7200.12 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:54:33,652 epoch 7 - iter 693/992 - loss 0.20830501 - time (sec): 15.85 - samples/sec: 7200.02 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:54:35,933 epoch 7 - iter 792/992 - loss 0.20710786 - time (sec): 18.13 - samples/sec: 7175.38 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:54:38,175 epoch 7 - iter 891/992 - loss 0.20292449 - time (sec): 20.37 - samples/sec: 7254.20 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:54:40,445 epoch 7 - iter 990/992 - loss 0.20430713 - time (sec): 22.64 - samples/sec: 7229.07 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:54:40,490 ----------------------------------------------------------------------------------------------------
2023-10-18 20:54:40,490 EPOCH 7 done: loss 0.2040 - lr: 0.000010
2023-10-18 20:54:42,328 DEV : loss 0.16114094853401184 - f1-score (micro avg) 0.4081
2023-10-18 20:54:42,346 saving best model
2023-10-18 20:54:42,381 ----------------------------------------------------------------------------------------------------
2023-10-18 20:54:44,630 epoch 8 - iter 99/992 - loss 0.19783164 - time (sec): 2.25 - samples/sec: 7440.34 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:54:46,732 epoch 8 - iter 198/992 - loss 0.19042092 - time (sec): 4.35 - samples/sec: 7605.07 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:54:48,937 epoch 8 - iter 297/992 - loss 0.19412905 - time (sec): 6.56 - samples/sec: 7746.45 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:54:50,961 epoch 8 - iter 396/992 - loss 0.19712824 - time (sec): 8.58 - samples/sec: 7750.93 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:54:53,123 epoch 8 - iter 495/992 - loss 0.19440607 - time (sec): 10.74 - samples/sec: 7628.27 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:54:55,313 epoch 8 - iter 594/992 - loss 0.20059272 - time (sec): 12.93 - samples/sec: 7589.21 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:54:57,485 epoch 8 - iter 693/992 - loss 0.19929992 - time (sec): 15.10 - samples/sec: 7557.80 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:54:59,693 epoch 8 - iter 792/992 - loss 0.19752093 - time (sec): 17.31 - samples/sec: 7525.17 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:55:01,938 epoch 8 - iter 891/992 - loss 0.19794149 - time (sec): 19.56 - samples/sec: 7506.92 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:55:04,171 epoch 8 - iter 990/992 - loss 0.19798238 - time (sec): 21.79 - samples/sec: 7508.41 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:55:04,220 ----------------------------------------------------------------------------------------------------
2023-10-18 20:55:04,220 EPOCH 8 done: loss 0.1978 - lr: 0.000007
2023-10-18 20:55:06,037 DEV : loss 0.16143764555454254 - f1-score (micro avg) 0.4046
2023-10-18 20:55:06,056 ----------------------------------------------------------------------------------------------------
2023-10-18 20:55:08,164 epoch 9 - iter 99/992 - loss 0.19886128 - time (sec): 2.11 - samples/sec: 7962.82 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:55:10,429 epoch 9 - iter 198/992 - loss 0.20268705 - time (sec): 4.37 - samples/sec: 7900.46 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:55:12,615 epoch 9 - iter 297/992 - loss 0.19583829 - time (sec): 6.56 - samples/sec: 7683.31 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:55:14,782 epoch 9 - iter 396/992 - loss 0.19318088 - time (sec): 8.73 - samples/sec: 7552.63 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:55:16,985 epoch 9 - iter 495/992 - loss 0.19300654 - time (sec): 10.93 - samples/sec: 7540.55 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:55:19,255 epoch 9 - iter 594/992 - loss 0.19864381 - time (sec): 13.20 - samples/sec: 7477.14 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:55:21,468 epoch 9 - iter 693/992 - loss 0.19900744 - time (sec): 15.41 - samples/sec: 7405.54 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:55:23,682 epoch 9 - iter 792/992 - loss 0.19612859 - time (sec): 17.63 - samples/sec: 7398.32 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:55:25,923 epoch 9 - iter 891/992 - loss 0.19581107 - time (sec): 19.87 - samples/sec: 7408.49 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:55:28,250 epoch 9 - iter 990/992 - loss 0.19433849 - time (sec): 22.19 - samples/sec: 7373.99 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:55:28,298 ----------------------------------------------------------------------------------------------------
2023-10-18 20:55:28,298 EPOCH 9 done: loss 0.1941 - lr: 0.000003
2023-10-18 20:55:30,137 DEV : loss 0.16051384806632996 - f1-score (micro avg) 0.4078
2023-10-18 20:55:30,156 ----------------------------------------------------------------------------------------------------
2023-10-18 20:55:32,394 epoch 10 - iter 99/992 - loss 0.20330896 - time (sec): 2.24 - samples/sec: 7007.16 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:55:34,656 epoch 10 - iter 198/992 - loss 0.18894617 - time (sec): 4.50 - samples/sec: 7227.09 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:55:36,948 epoch 10 - iter 297/992 - loss 0.19045374 - time (sec): 6.79 - samples/sec: 7212.88 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:55:39,134 epoch 10 - iter 396/992 - loss 0.19170091 - time (sec): 8.98 - samples/sec: 7256.78 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:55:41,390 epoch 10 - iter 495/992 - loss 0.18876878 - time (sec): 11.23 - samples/sec: 7289.13 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:55:43,609 epoch 10 - iter 594/992 - loss 0.19227055 - time (sec): 13.45 - samples/sec: 7327.60 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:55:45,880 epoch 10 - iter 693/992 - loss 0.19144431 - time (sec): 15.72 - samples/sec: 7322.12 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:55:48,086 epoch 10 - iter 792/992 - loss 0.19069240 - time (sec): 17.93 - samples/sec: 7344.98 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:55:50,261 epoch 10 - iter 891/992 - loss 0.19135351 - time (sec): 20.11 - samples/sec: 7328.13 - lr: 0.000000 - momentum: 0.000000
2023-10-18 20:55:52,556 epoch 10 - iter 990/992 - loss 0.19175148 - time (sec): 22.40 - samples/sec: 7310.16 - lr: 0.000000 - momentum: 0.000000
2023-10-18 20:55:52,602 ----------------------------------------------------------------------------------------------------
2023-10-18 20:55:52,602 EPOCH 10 done: loss 0.1919 - lr: 0.000000
2023-10-18 20:55:54,456 DEV : loss 0.16066327691078186 - f1-score (micro avg) 0.4119
2023-10-18 20:55:54,475 saving best model
2023-10-18 20:55:54,537 ----------------------------------------------------------------------------------------------------
2023-10-18 20:55:54,538 Loading model from best epoch ...
2023-10-18 20:55:54,607 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 20:55:56,064
Results:
- F-score (micro) 0.4631
- F-score (macro) 0.3022
- Accuracy 0.3478
By class:
              precision    recall  f1-score   support
         LOC     0.6305    0.5679    0.5976       655
         PER     0.2269    0.4843    0.3090       223
         ORG     0.0000    0.0000    0.0000       127
   micro avg     0.4494    0.4776    0.4631      1005
   macro avg     0.2858    0.3507    0.3022      1005
weighted avg     0.4613    0.4776    0.4580      1005
2023-10-18 20:55:56,064 ----------------------------------------------------------------------------------------------------
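The averaged rows in the "By class" table can be cross-checked against the per-class rows. The sketch below is only an arithmetic check on the numbers printed in this log, not the evaluation code that produced them; the variable and function names are illustrative.

```python
# Per-class rows copied from the "By class" table: (precision, recall, f1, support).
per_class = {
    "LOC": (0.6305, 0.5679, 0.5976, 655),
    "PER": (0.2269, 0.4843, 0.3090, 223),
    "ORG": (0.0000, 0.0000, 0.0000, 127),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall, defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total_support = sum(row[3] for row in per_class.values())

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by the class support.
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro average: F1 of the micro-averaged precision and recall from the table.
micro_f1 = f1(0.4494, 0.4776)

if __name__ == "__main__":
    print(f"support      {total_support}")
    print(f"micro F1     {micro_f1:.4f}")
    print(f"macro F1     {macro_f1:.4f}")
    print(f"weighted F1  {weighted_f1:.4f}")
```

The recomputed values agree with the table to four decimals. The macro average (0.3022) sits well below the micro average (0.4631) because ORG, with 127 of the 1005 gold entities, scores 0.0000 and counts as a full third of the macro mean.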