2023-10-17 17:38:22,631 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,633 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 17:38:22,633 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,633 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-17 17:38:22,633 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,633 Train: 20847 sentences
2023-10-17 17:38:22,634 (train_with_dev=False, train_with_test=False)
2023-10-17 17:38:22,634 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,634 Training Params:
2023-10-17 17:38:22,634 - learning_rate: "3e-05"
2023-10-17 17:38:22,634 - mini_batch_size: "8"
2023-10-17 17:38:22,634 - max_epochs: "10"
2023-10-17 17:38:22,634 - shuffle: "True"
2023-10-17 17:38:22,634 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,634 Plugins:
2023-10-17 17:38:22,634 - TensorboardLogger
2023-10-17 17:38:22,634 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:38:22,634 ----------------------------------------------------------------------------------------------------
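Note on the `lr` column: the values logged in the iteration lines below are consistent with a linear warmup over the first 10% of all steps (here exactly epoch 1), followed by a linear decay to zero. The following is a minimal sketch under that assumption; it is not taken from Flair's source, and the function name `linear_warmup_lr` and the step accounting are illustrative only.

```python
import math

# Values read from this log: 20847 train sentences, mini_batch_size 8,
# max_epochs 10, learning_rate 3e-05, warmup_fraction 0.1.
TRAIN_SENTENCES = 20847
BATCH_SIZE = 8
EPOCHS = 10
PEAK_LR = 3e-05
WARMUP_FRACTION = 0.1

steps_per_epoch = math.ceil(TRAIN_SENTENCES / BATCH_SIZE)  # 2606, as in "iter .../2606"
total_steps = steps_per_epoch * EPOCHS


def linear_warmup_lr(step, peak_lr=PEAK_LR, total=total_steps,
                     warmup_fraction=WARMUP_FRACTION):
    """Hypothetical helper: linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = round(total * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0.0, (total - step) / (total - warmup_steps))


def global_step(epoch, iteration):
    """Convert a 1-based (epoch, iter) pair from the log into a global step."""
    return (epoch - 1) * steps_per_epoch + iteration


for epoch, iteration in [(1, 260), (2, 520), (5, 260)]:
    lr = linear_warmup_lr(global_step(epoch, iteration))
    print(f"epoch {epoch} - iter {iteration} - lr: {lr:.6f}")
```

Rounded to six decimals, as in the log, these reproduce the logged values for the three sampled iterations.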
2023-10-17 17:38:22,634 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:38:22,634 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:38:22,634 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,635 Computation:
2023-10-17 17:38:22,635 - compute on device: cuda:0
2023-10-17 17:38:22,635 - embedding storage: none
2023-10-17 17:38:22,635 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,635 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 17:38:22,635 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,635 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,635 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:38:49,992 epoch 1 - iter 260/2606 - loss 2.16108750 - time (sec): 27.36 - samples/sec: 1370.93 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:39:16,294 epoch 1 - iter 520/2606 - loss 1.30692459 - time (sec): 53.66 - samples/sec: 1390.98 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:39:43,218 epoch 1 - iter 780/2606 - loss 0.99754433 - time (sec): 80.58 - samples/sec: 1367.60 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:40:10,453 epoch 1 - iter 1040/2606 - loss 0.82453976 - time (sec): 107.82 - samples/sec: 1349.01 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:40:36,352 epoch 1 - iter 1300/2606 - loss 0.71369382 - time (sec): 133.72 - samples/sec: 1341.33 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:41:03,213 epoch 1 - iter 1560/2606 - loss 0.62993344 - time (sec): 160.58 - samples/sec: 1349.33 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:41:31,274 epoch 1 - iter 1820/2606 - loss 0.57154390 - time (sec): 188.64 - samples/sec: 1337.61 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:41:58,975 epoch 1 - iter 2080/2606 - loss 0.52820388 - time (sec): 216.34 - samples/sec: 1339.02 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:42:25,833 epoch 1 - iter 2340/2606 - loss 0.49230602 - time (sec): 243.20 - samples/sec: 1335.50 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:42:54,842 epoch 1 - iter 2600/2606 - loss 0.45946927 - time (sec): 272.20 - samples/sec: 1347.07 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:42:55,398 ----------------------------------------------------------------------------------------------------
2023-10-17 17:42:55,399 EPOCH 1 done: loss 0.4589 - lr: 0.000030
2023-10-17 17:43:03,287 DEV : loss 0.12348709255456924 - f1-score (micro avg) 0.3015
2023-10-17 17:43:03,342 saving best model
2023-10-17 17:43:03,884 ----------------------------------------------------------------------------------------------------
2023-10-17 17:43:31,177 epoch 2 - iter 260/2606 - loss 0.17303029 - time (sec): 27.29 - samples/sec: 1347.32 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:43:59,525 epoch 2 - iter 520/2606 - loss 0.16886574 - time (sec): 55.64 - samples/sec: 1373.29 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:44:26,379 epoch 2 - iter 780/2606 - loss 0.16483756 - time (sec): 82.49 - samples/sec: 1359.54 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:44:53,079 epoch 2 - iter 1040/2606 - loss 0.16595272 - time (sec): 109.19 - samples/sec: 1354.15 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:45:19,612 epoch 2 - iter 1300/2606 - loss 0.16465462 - time (sec): 135.73 - samples/sec: 1346.09 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:45:46,866 epoch 2 - iter 1560/2606 - loss 0.16186852 - time (sec): 162.98 - samples/sec: 1348.08 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:46:14,295 epoch 2 - iter 1820/2606 - loss 0.16153027 - time (sec): 190.41 - samples/sec: 1335.51 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:46:41,254 epoch 2 - iter 2080/2606 - loss 0.15817180 - time (sec): 217.37 - samples/sec: 1341.88 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:47:07,690 epoch 2 - iter 2340/2606 - loss 0.15541969 - time (sec): 243.80 - samples/sec: 1349.13 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:47:35,914 epoch 2 - iter 2600/2606 - loss 0.15324901 - time (sec): 272.03 - samples/sec: 1348.38 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:47:36,492 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:36,492 EPOCH 2 done: loss 0.1533 - lr: 0.000027
2023-10-17 17:47:48,737 DEV : loss 0.16292980313301086 - f1-score (micro avg) 0.341
2023-10-17 17:47:48,800 saving best model
2023-10-17 17:47:50,156 ----------------------------------------------------------------------------------------------------
2023-10-17 17:48:16,842 epoch 3 - iter 260/2606 - loss 0.11265805 - time (sec): 26.68 - samples/sec: 1334.85 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:48:44,542 epoch 3 - iter 520/2606 - loss 0.10845921 - time (sec): 54.38 - samples/sec: 1343.73 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:49:11,468 epoch 3 - iter 780/2606 - loss 0.10959502 - time (sec): 81.31 - samples/sec: 1348.75 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:49:37,863 epoch 3 - iter 1040/2606 - loss 0.11240173 - time (sec): 107.70 - samples/sec: 1348.81 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:50:04,462 epoch 3 - iter 1300/2606 - loss 0.11114369 - time (sec): 134.30 - samples/sec: 1352.04 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:50:32,138 epoch 3 - iter 1560/2606 - loss 0.10929810 - time (sec): 161.98 - samples/sec: 1364.81 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:50:57,682 epoch 3 - iter 1820/2606 - loss 0.11045895 - time (sec): 187.52 - samples/sec: 1371.97 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:51:24,187 epoch 3 - iter 2080/2606 - loss 0.11033176 - time (sec): 214.03 - samples/sec: 1378.25 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:51:49,493 epoch 3 - iter 2340/2606 - loss 0.10971803 - time (sec): 239.33 - samples/sec: 1370.69 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:52:17,879 epoch 3 - iter 2600/2606 - loss 0.10861513 - time (sec): 267.72 - samples/sec: 1370.17 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:52:18,429 ----------------------------------------------------------------------------------------------------
2023-10-17 17:52:18,429 EPOCH 3 done: loss 0.1085 - lr: 0.000023
2023-10-17 17:52:31,403 DEV : loss 0.21983903646469116 - f1-score (micro avg) 0.3474
2023-10-17 17:52:31,463 saving best model
2023-10-17 17:52:32,872 ----------------------------------------------------------------------------------------------------
2023-10-17 17:53:00,390 epoch 4 - iter 260/2606 - loss 0.07713242 - time (sec): 27.51 - samples/sec: 1346.41 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:53:28,664 epoch 4 - iter 520/2606 - loss 0.08561030 - time (sec): 55.79 - samples/sec: 1351.93 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:53:56,211 epoch 4 - iter 780/2606 - loss 0.08498077 - time (sec): 83.33 - samples/sec: 1326.24 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:54:22,655 epoch 4 - iter 1040/2606 - loss 0.08337658 - time (sec): 109.78 - samples/sec: 1336.06 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:54:49,664 epoch 4 - iter 1300/2606 - loss 0.08143883 - time (sec): 136.79 - samples/sec: 1339.71 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:55:16,666 epoch 4 - iter 1560/2606 - loss 0.08262716 - time (sec): 163.79 - samples/sec: 1336.65 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:55:44,033 epoch 4 - iter 1820/2606 - loss 0.08312524 - time (sec): 191.16 - samples/sec: 1336.14 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:56:13,210 epoch 4 - iter 2080/2606 - loss 0.08305110 - time (sec): 220.33 - samples/sec: 1332.00 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:56:40,320 epoch 4 - iter 2340/2606 - loss 0.08202034 - time (sec): 247.44 - samples/sec: 1329.58 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:57:08,440 epoch 4 - iter 2600/2606 - loss 0.08136397 - time (sec): 275.56 - samples/sec: 1330.33 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:57:09,019 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:09,019 EPOCH 4 done: loss 0.0813 - lr: 0.000020
2023-10-17 17:57:21,370 DEV : loss 0.27437207102775574 - f1-score (micro avg) 0.4033
2023-10-17 17:57:21,425 saving best model
2023-10-17 17:57:22,832 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:50,404 epoch 5 - iter 260/2606 - loss 0.05833828 - time (sec): 27.57 - samples/sec: 1316.84 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:58:17,100 epoch 5 - iter 520/2606 - loss 0.05838581 - time (sec): 54.26 - samples/sec: 1338.09 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:58:45,873 epoch 5 - iter 780/2606 - loss 0.05325511 - time (sec): 83.04 - samples/sec: 1335.64 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:59:14,285 epoch 5 - iter 1040/2606 - loss 0.05210183 - time (sec): 111.45 - samples/sec: 1331.46 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:59:42,645 epoch 5 - iter 1300/2606 - loss 0.05559593 - time (sec): 139.81 - samples/sec: 1323.08 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:00:10,380 epoch 5 - iter 1560/2606 - loss 0.05487736 - time (sec): 167.54 - samples/sec: 1329.02 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:00:37,660 epoch 5 - iter 1820/2606 - loss 0.05506955 - time (sec): 194.82 - samples/sec: 1328.17 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:01:04,469 epoch 5 - iter 2080/2606 - loss 0.05519986 - time (sec): 221.63 - samples/sec: 1324.06 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:01:32,780 epoch 5 - iter 2340/2606 - loss 0.05451422 - time (sec): 249.94 - samples/sec: 1326.95 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:01:59,316 epoch 5 - iter 2600/2606 - loss 0.05460912 - time (sec): 276.48 - samples/sec: 1326.47 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:01:59,877 ----------------------------------------------------------------------------------------------------
2023-10-17 18:01:59,877 EPOCH 5 done: loss 0.0546 - lr: 0.000017
2023-10-17 18:02:12,437 DEV : loss 0.2996568977832794 - f1-score (micro avg) 0.3881
2023-10-17 18:02:12,493 ----------------------------------------------------------------------------------------------------
2023-10-17 18:02:40,068 epoch 6 - iter 260/2606 - loss 0.03423666 - time (sec): 27.57 - samples/sec: 1364.83 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:03:06,656 epoch 6 - iter 520/2606 - loss 0.03348953 - time (sec): 54.16 - samples/sec: 1321.35 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:03:33,792 epoch 6 - iter 780/2606 - loss 0.03525955 - time (sec): 81.30 - samples/sec: 1315.45 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:04:00,415 epoch 6 - iter 1040/2606 - loss 0.03884942 - time (sec): 107.92 - samples/sec: 1313.70 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:04:27,780 epoch 6 - iter 1300/2606 - loss 0.03894248 - time (sec): 135.28 - samples/sec: 1310.85 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:04:55,792 epoch 6 - iter 1560/2606 - loss 0.03858817 - time (sec): 163.30 - samples/sec: 1314.79 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:05:23,682 epoch 6 - iter 1820/2606 - loss 0.03919379 - time (sec): 191.19 - samples/sec: 1315.67 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:05:53,070 epoch 6 - iter 2080/2606 - loss 0.03898069 - time (sec): 220.57 - samples/sec: 1320.80 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:06:21,530 epoch 6 - iter 2340/2606 - loss 0.03869181 - time (sec): 249.03 - samples/sec: 1332.24 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:06:48,297 epoch 6 - iter 2600/2606 - loss 0.03893522 - time (sec): 275.80 - samples/sec: 1328.89 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:06:49,003 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:49,003 EPOCH 6 done: loss 0.0389 - lr: 0.000013
2023-10-17 18:07:01,825 DEV : loss 0.34783273935317993 - f1-score (micro avg) 0.4035
2023-10-17 18:07:01,879 saving best model
2023-10-17 18:07:03,283 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:31,404 epoch 7 - iter 260/2606 - loss 0.02894101 - time (sec): 28.12 - samples/sec: 1363.81 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:07:58,656 epoch 7 - iter 520/2606 - loss 0.02724580 - time (sec): 55.37 - samples/sec: 1352.93 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:08:27,592 epoch 7 - iter 780/2606 - loss 0.03067550 - time (sec): 84.30 - samples/sec: 1333.06 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:08:54,620 epoch 7 - iter 1040/2606 - loss 0.02932069 - time (sec): 111.33 - samples/sec: 1319.87 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:09:21,942 epoch 7 - iter 1300/2606 - loss 0.02883956 - time (sec): 138.65 - samples/sec: 1322.62 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:09:49,932 epoch 7 - iter 1560/2606 - loss 0.02806752 - time (sec): 166.65 - samples/sec: 1331.28 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:10:16,796 epoch 7 - iter 1820/2606 - loss 0.02809168 - time (sec): 193.51 - samples/sec: 1329.05 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:10:44,355 epoch 7 - iter 2080/2606 - loss 0.02817759 - time (sec): 221.07 - samples/sec: 1321.32 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:11:12,175 epoch 7 - iter 2340/2606 - loss 0.02835946 - time (sec): 248.89 - samples/sec: 1318.43 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:11:40,537 epoch 7 - iter 2600/2606 - loss 0.02853150 - time (sec): 277.25 - samples/sec: 1320.98 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:11:41,332 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:41,332 EPOCH 7 done: loss 0.0285 - lr: 0.000010
2023-10-17 18:11:52,778 DEV : loss 0.4449138045310974 - f1-score (micro avg) 0.3797
2023-10-17 18:11:52,843 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:20,333 epoch 8 - iter 260/2606 - loss 0.02355353 - time (sec): 27.49 - samples/sec: 1308.83 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:12:48,478 epoch 8 - iter 520/2606 - loss 0.02242287 - time (sec): 55.63 - samples/sec: 1301.33 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:13:17,210 epoch 8 - iter 780/2606 - loss 0.02142082 - time (sec): 84.36 - samples/sec: 1302.22 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:13:44,116 epoch 8 - iter 1040/2606 - loss 0.02198135 - time (sec): 111.27 - samples/sec: 1318.24 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:14:11,882 epoch 8 - iter 1300/2606 - loss 0.02245041 - time (sec): 139.04 - samples/sec: 1320.33 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:14:38,817 epoch 8 - iter 1560/2606 - loss 0.02235451 - time (sec): 165.97 - samples/sec: 1320.02 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:15:06,477 epoch 8 - iter 1820/2606 - loss 0.02124624 - time (sec): 193.63 - samples/sec: 1321.69 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:15:33,535 epoch 8 - iter 2080/2606 - loss 0.02100098 - time (sec): 220.69 - samples/sec: 1321.00 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:16:00,553 epoch 8 - iter 2340/2606 - loss 0.02124546 - time (sec): 247.71 - samples/sec: 1323.91 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:16:29,052 epoch 8 - iter 2600/2606 - loss 0.02098403 - time (sec): 276.21 - samples/sec: 1327.93 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:16:29,596 ----------------------------------------------------------------------------------------------------
2023-10-17 18:16:29,597 EPOCH 8 done: loss 0.0211 - lr: 0.000007
2023-10-17 18:16:40,893 DEV : loss 0.5216322541236877 - f1-score (micro avg) 0.3707
2023-10-17 18:16:40,956 ----------------------------------------------------------------------------------------------------
2023-10-17 18:17:09,504 epoch 9 - iter 260/2606 - loss 0.01271966 - time (sec): 28.55 - samples/sec: 1427.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:17:36,981 epoch 9 - iter 520/2606 - loss 0.01328728 - time (sec): 56.02 - samples/sec: 1396.29 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:18:04,506 epoch 9 - iter 780/2606 - loss 0.01441554 - time (sec): 83.55 - samples/sec: 1357.80 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:18:32,793 epoch 9 - iter 1040/2606 - loss 0.01396545 - time (sec): 111.83 - samples/sec: 1344.18 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:18:59,609 epoch 9 - iter 1300/2606 - loss 0.01392302 - time (sec): 138.65 - samples/sec: 1349.91 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:19:26,372 epoch 9 - iter 1560/2606 - loss 0.01326058 - time (sec): 165.41 - samples/sec: 1349.25 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:19:53,465 epoch 9 - iter 1820/2606 - loss 0.01302468 - time (sec): 192.51 - samples/sec: 1344.95 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:20:21,078 epoch 9 - iter 2080/2606 - loss 0.01339090 - time (sec): 220.12 - samples/sec: 1348.97 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:20:48,720 epoch 9 - iter 2340/2606 - loss 0.01350316 - time (sec): 247.76 - samples/sec: 1349.79 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:21:13,941 epoch 9 - iter 2600/2606 - loss 0.01339440 - time (sec): 272.98 - samples/sec: 1343.38 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:21:14,446 ----------------------------------------------------------------------------------------------------
2023-10-17 18:21:14,446 EPOCH 9 done: loss 0.0135 - lr: 0.000003
2023-10-17 18:21:25,731 DEV : loss 0.5246009230613708 - f1-score (micro avg) 0.3787
2023-10-17 18:21:25,786 ----------------------------------------------------------------------------------------------------
2023-10-17 18:21:53,457 epoch 10 - iter 260/2606 - loss 0.01018569 - time (sec): 27.67 - samples/sec: 1373.20 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:22:22,398 epoch 10 - iter 520/2606 - loss 0.01216228 - time (sec): 56.61 - samples/sec: 1368.55 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:22:48,583 epoch 10 - iter 780/2606 - loss 0.01210997 - time (sec): 82.79 - samples/sec: 1337.18 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:23:16,814 epoch 10 - iter 1040/2606 - loss 0.01155669 - time (sec): 111.03 - samples/sec: 1338.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:23:44,121 epoch 10 - iter 1300/2606 - loss 0.01112163 - time (sec): 138.33 - samples/sec: 1322.20 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:24:11,721 epoch 10 - iter 1560/2606 - loss 0.01128677 - time (sec): 165.93 - samples/sec: 1324.56 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:24:41,182 epoch 10 - iter 1820/2606 - loss 0.01108459 - time (sec): 195.39 - samples/sec: 1327.32 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:25:08,860 epoch 10 - iter 2080/2606 - loss 0.01112871 - time (sec): 223.07 - samples/sec: 1326.85 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:25:35,011 epoch 10 - iter 2340/2606 - loss 0.01122759 - time (sec): 249.22 - samples/sec: 1322.79 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:26:02,919 epoch 10 - iter 2600/2606 - loss 0.01076733 - time (sec): 277.13 - samples/sec: 1322.70 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:26:03,521 ----------------------------------------------------------------------------------------------------
2023-10-17 18:26:03,521 EPOCH 10 done: loss 0.0107 - lr: 0.000000
2023-10-17 18:26:14,910 DEV : loss 0.5314543843269348 - f1-score (micro avg) 0.3801
2023-10-17 18:26:15,512 ----------------------------------------------------------------------------------------------------
2023-10-17 18:26:15,514 Loading model from best epoch ...
2023-10-17 18:26:18,076 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 18:26:38,263
Results:
- F-score (micro) 0.4345
- F-score (macro) 0.3022
- Accuracy 0.282
By class:
              precision    recall  f1-score   support

         LOC     0.4256    0.5255    0.4703      1214
         PER     0.4325    0.4480    0.4401       808
         ORG     0.2940    0.3031    0.2985       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4091    0.4632    0.4345      2390
   macro avg     0.2880    0.3192    0.3022      2390
weighted avg     0.4058    0.4632    0.4318      2390
2023-10-17 18:26:38,263 ----------------------------------------------------------------------------------------------------
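Note on the averaged rows: the summary F-scores above can be cross-checked against the per-class rows. The sketch below assumes the usual definitions (macro = unweighted mean of per-class F1, weighted = support-weighted mean, micro F1 = harmonic mean of micro precision and recall). It works only from the rounded figures printed in the report, so agreement is expected to about four decimals; the variable names are illustrative.

```python
# Per-class (precision, recall, f1, support) copied from the report above.
per_class = {
    "LOC": (0.4256, 0.5255, 0.4703, 1214),
    "PER": (0.4325, 0.4480, 0.4401, 808),
    "ORG": (0.2940, 0.3031, 0.2985, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}

# Micro-averaged precision and recall as printed in the "micro avg" row.
micro_precision, micro_recall = 0.4091, 0.4632

total_support = sum(support for *_, support in per_class.values())

# Macro F1: every class counts equally, so the 15-mention HumanProd class
# (F1 = 0) pulls the average down as much as LOC pulls it up.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted F1: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

# Micro F1: harmonic mean of the pooled precision and recall.
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"support      {total_support}")
print(f"macro F1     {macro_f1:.4f}")
print(f"weighted F1  {weighted_f1:.4f}")
print(f"micro F1     {micro_f1:.4f}")
```

The gap between micro (0.4345) and macro (0.3022) F1 is driven largely by the HumanProd class, which has only 15 test mentions and none predicted correctly.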