2023-10-17 17:38:22,631 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,633 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:38:22,633 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,633 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-17 17:38:22,633
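The run configuration logged below (learning_rate 3e-05, max_epochs 10, 2606 iterations per epoch, `LinearScheduler | warmup_fraction: '0.1'`) implies a linear warmup-then-decay learning-rate schedule, which is consistent with the per-iteration `lr` values in the trace. A minimal pure-Python sketch of that schedule shape; the helper name `linear_lr` and the exact warmup/decay formula are my own assumptions, checked only against the logged values:

```python
def linear_lr(step: int, total_steps: int, peak: float = 3e-05,
              warmup_fraction: float = 0.1) -> float:
    """Linear warmup from 0 to `peak`, then linear decay back to 0.

    Schedule shape is assumed from the logged LinearScheduler plugin;
    this helper is illustrative, not Flair's implementation.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak * step / warmup_steps
    return peak * (total_steps - step) / (total_steps - warmup_steps)

total = 10 * 2606  # max_epochs x iterations per epoch, from the log
print(linear_lr(260, total))   # ~3e-06, matching the lr logged at epoch 1, iter 260
print(linear_lr(2600, total))  # ~3e-05, matching the lr at the end of epoch 1
```

With warmup_fraction 0.1 over 10 epochs, the peak is reached at the end of epoch 1, which is exactly where the trace shows lr 0.000030 before it starts falling.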
----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,633 Train: 20847 sentences
2023-10-17 17:38:22,634 (train_with_dev=False, train_with_test=False)
2023-10-17 17:38:22,634 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,634 Training Params:
2023-10-17 17:38:22,634 - learning_rate: "3e-05"
2023-10-17 17:38:22,634 - mini_batch_size: "8"
2023-10-17 17:38:22,634 - max_epochs: "10"
2023-10-17 17:38:22,634 - shuffle: "True"
2023-10-17 17:38:22,634 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,634 Plugins:
2023-10-17 17:38:22,634 - TensorboardLogger
2023-10-17 17:38:22,634 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:38:22,634 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,634 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:38:22,634 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:38:22,634 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,635 Computation:
2023-10-17 17:38:22,635 - compute on device: cuda:0
2023-10-17 17:38:22,635 - embedding storage: none
2023-10-17 17:38:22,635 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,635 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 17:38:22,635 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,635 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:22,635 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:38:49,992 epoch 1 - iter 260/2606 - loss 2.16108750 - time (sec): 27.36 - samples/sec: 1370.93 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:39:16,294 epoch 1 - iter 520/2606 - loss 1.30692459 - time (sec): 53.66 - samples/sec: 1390.98 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:39:43,218 epoch 1 - iter 780/2606 - loss 0.99754433 - time (sec): 80.58 - samples/sec: 1367.60 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:40:10,453 epoch 1 - iter 1040/2606 - loss 0.82453976 - time (sec): 107.82 - samples/sec: 1349.01 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:40:36,352 epoch 1 - iter 1300/2606 - loss 0.71369382 - time (sec): 133.72 - samples/sec: 1341.33 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:41:03,213 epoch 1 - iter 1560/2606 - loss 0.62993344 - time (sec): 160.58 - samples/sec: 1349.33 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:41:31,274 epoch 1 - iter 1820/2606 - loss 0.57154390 - time (sec): 188.64 - samples/sec: 1337.61 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:41:58,975 epoch 1 - iter 2080/2606 - loss 0.52820388 - time (sec): 216.34 - samples/sec: 1339.02 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:42:25,833 epoch 1 - iter 2340/2606 - loss 0.49230602 - time (sec): 243.20 - samples/sec: 1335.50 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:42:54,842 epoch 1 - iter 2600/2606 - loss 0.45946927 - time (sec): 272.20 - samples/sec: 1347.07 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:42:55,398 ----------------------------------------------------------------------------------------------------
2023-10-17 17:42:55,399 EPOCH 1 done: loss 0.4589 - lr: 0.000030
2023-10-17 17:43:03,287 DEV : loss 0.12348709255456924 - f1-score (micro avg) 0.3015
2023-10-17 17:43:03,342 saving best model
2023-10-17 17:43:03,884 ----------------------------------------------------------------------------------------------------
2023-10-17 17:43:31,177 epoch 2 - iter 260/2606 - loss 0.17303029 - time (sec): 27.29 - samples/sec: 1347.32 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:43:59,525 epoch 2 - iter 520/2606 - loss 0.16886574 - time (sec): 55.64 - samples/sec: 1373.29 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:44:26,379 epoch 2 - iter 780/2606 - loss 0.16483756 - time (sec): 82.49 - samples/sec: 1359.54 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:44:53,079 epoch 2 - iter 1040/2606 - loss 0.16595272 - time (sec): 109.19 - samples/sec: 1354.15 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:45:19,612 epoch 2 - iter 1300/2606 - loss 0.16465462 - time (sec): 135.73 - samples/sec: 1346.09 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:45:46,866 epoch 2 - iter 1560/2606 - loss 0.16186852 - time (sec): 162.98 - samples/sec: 1348.08 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:46:14,295 epoch 2 - iter 1820/2606 - loss 0.16153027 - time (sec): 190.41 - samples/sec: 1335.51 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:46:41,254 epoch 2 - iter 2080/2606 - loss 0.15817180 - time (sec): 217.37 - samples/sec: 1341.88 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:47:07,690 epoch 2 - iter 2340/2606 - loss 0.15541969 - time (sec): 243.80 - samples/sec: 1349.13 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:47:35,914 epoch 2 - iter 2600/2606 - loss 0.15324901 - time (sec): 272.03 - samples/sec: 1348.38 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:47:36,492 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:36,492 EPOCH 2 done: loss 0.1533 - lr: 0.000027
2023-10-17 17:47:48,737 DEV : loss 0.16292980313301086 - f1-score (micro avg) 0.341
2023-10-17 17:47:48,800 saving best model
2023-10-17 17:47:50,156 ----------------------------------------------------------------------------------------------------
2023-10-17 17:48:16,842 epoch 3 - iter 260/2606 - loss 0.11265805 - time (sec): 26.68 - samples/sec: 1334.85 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:48:44,542 epoch 3 - iter 520/2606 - loss 0.10845921 - time (sec): 54.38 - samples/sec: 1343.73 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:49:11,468 epoch 3 - iter 780/2606 - loss 0.10959502 - time (sec): 81.31 - samples/sec: 1348.75 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:49:37,863 epoch 3 - iter 1040/2606 - loss 0.11240173 - time (sec): 107.70 - samples/sec: 1348.81 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:50:04,462 epoch 3 - iter 1300/2606 - loss 0.11114369 - time (sec): 134.30 - samples/sec: 1352.04 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:50:32,138 epoch 3 - iter 1560/2606 - loss 0.10929810 - time (sec): 161.98 - samples/sec: 1364.81 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:50:57,682 epoch 3 - iter 1820/2606 - loss 0.11045895 - time (sec): 187.52 - samples/sec: 1371.97 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:51:24,187 epoch 3 - iter 2080/2606 - loss 0.11033176 - time (sec): 214.03 - samples/sec: 1378.25 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:51:49,493 epoch 3 - iter 2340/2606 - loss 0.10971803 - time (sec): 239.33 - samples/sec: 1370.69 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:52:17,879 epoch 3 - iter 2600/2606 - loss 0.10861513 - time (sec): 267.72 - samples/sec: 1370.17 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:52:18,429 ----------------------------------------------------------------------------------------------------
2023-10-17 17:52:18,429 EPOCH 3 done: loss 0.1085 - lr: 0.000023
2023-10-17 17:52:31,403 DEV : loss 0.21983903646469116 - f1-score (micro avg) 0.3474
2023-10-17 17:52:31,463 saving best model
2023-10-17 17:52:32,872 ----------------------------------------------------------------------------------------------------
2023-10-17 17:53:00,390 epoch 4 - iter 260/2606 - loss 0.07713242 - time (sec): 27.51 - samples/sec: 1346.41 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:53:28,664 epoch 4 - iter 520/2606 - loss 0.08561030 - time (sec): 55.79 - samples/sec: 1351.93 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:53:56,211 epoch 4 - iter 780/2606 - loss 0.08498077 - time (sec): 83.33 - samples/sec: 1326.24 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:54:22,655 epoch 4 - iter 1040/2606 - loss 0.08337658 - time (sec): 109.78 - samples/sec: 1336.06 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:54:49,664 epoch 4 - iter 1300/2606 - loss 0.08143883 - time (sec): 136.79 - samples/sec: 1339.71 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:55:16,666 epoch 4 - iter 1560/2606 - loss 0.08262716 - time (sec): 163.79 - samples/sec: 1336.65 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:55:44,033 epoch 4 - iter 1820/2606 - loss 0.08312524 - time (sec): 191.16 - samples/sec: 1336.14 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:56:13,210 epoch 4 - iter 2080/2606 - loss 0.08305110 - time (sec): 220.33 - samples/sec: 1332.00 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:56:40,320 epoch 4 - iter 2340/2606 - loss 0.08202034 - time (sec): 247.44 - samples/sec: 1329.58 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:57:08,440 epoch 4 - iter 2600/2606 - loss 0.08136397 - time (sec): 275.56 - samples/sec: 1330.33 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:57:09,019 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:09,019 EPOCH 4 done: loss 0.0813 - lr: 0.000020
2023-10-17 17:57:21,370 DEV : loss 0.27437207102775574 - f1-score (micro avg) 0.4033
2023-10-17 17:57:21,425 saving best model
2023-10-17 17:57:22,832 ----------------------------------------------------------------------------------------------------
2023-10-17 17:57:50,404 epoch 5 - iter 260/2606 - loss 0.05833828 - time (sec): 27.57 - samples/sec: 1316.84 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:58:17,100 epoch 5 - iter 520/2606 - loss 0.05838581 - time (sec): 54.26 - samples/sec: 1338.09 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:58:45,873 epoch 5 - iter 780/2606 - loss 0.05325511 - time (sec): 83.04 - samples/sec: 1335.64 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:59:14,285 epoch 5 - iter 1040/2606 - loss 0.05210183 - time (sec): 111.45 - samples/sec: 1331.46 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:59:42,645 epoch 5 - iter 1300/2606 - loss 0.05559593 - time (sec): 139.81 - samples/sec: 1323.08 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:00:10,380 epoch 5 - iter 1560/2606 - loss 0.05487736 - time (sec): 167.54 - samples/sec: 1329.02 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:00:37,660 epoch 5 - iter 1820/2606 - loss 0.05506955 - time (sec): 194.82 - samples/sec: 1328.17 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:01:04,469 epoch 5 - iter 2080/2606 - loss 0.05519986 - time (sec): 221.63 - samples/sec: 1324.06 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:01:32,780 epoch 5 - iter 2340/2606 - loss 0.05451422 - time (sec): 249.94 - samples/sec: 1326.95 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:01:59,316 epoch 5 - iter 2600/2606 - loss 0.05460912 - time (sec): 276.48 - samples/sec: 1326.47 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:01:59,877 ----------------------------------------------------------------------------------------------------
2023-10-17 18:01:59,877 EPOCH 5 done: loss 0.0546 - lr: 0.000017
2023-10-17 18:02:12,437 DEV : loss 0.2996568977832794 - f1-score (micro avg) 0.3881
2023-10-17 18:02:12,493 ----------------------------------------------------------------------------------------------------
2023-10-17 18:02:40,068 epoch 6 - iter 260/2606 - loss 0.03423666 - time (sec): 27.57 - samples/sec: 1364.83 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:03:06,656 epoch 6 - iter 520/2606 - loss 0.03348953 - time (sec): 54.16 - samples/sec: 1321.35 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:03:33,792 epoch 6 - iter 780/2606 - loss 0.03525955 - time (sec): 81.30 - samples/sec: 1315.45 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:04:00,415 epoch 6 - iter 1040/2606 - loss 0.03884942 - time (sec): 107.92 - samples/sec: 1313.70 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:04:27,780 epoch 6 - iter 1300/2606 - loss 0.03894248 - time (sec): 135.28 - samples/sec: 1310.85 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:04:55,792 epoch 6 - iter 1560/2606 - loss 0.03858817 - time (sec): 163.30 - samples/sec: 1314.79 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:05:23,682 epoch 6 - iter 1820/2606 - loss 0.03919379 - time (sec): 191.19 - samples/sec: 1315.67 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:05:53,070 epoch 6 - iter 2080/2606 - loss 0.03898069 - time (sec): 220.57 - samples/sec: 1320.80 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:06:21,530 epoch 6 - iter 2340/2606 - loss 0.03869181 - time (sec): 249.03 - samples/sec: 1332.24 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:06:48,297 epoch 6 - iter 2600/2606 - loss 0.03893522 - time (sec): 275.80 - samples/sec: 1328.89 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:06:49,003 ----------------------------------------------------------------------------------------------------
2023-10-17 18:06:49,003 EPOCH 6 done: loss 0.0389 - lr: 0.000013
2023-10-17 18:07:01,825 DEV : loss 0.34783273935317993 - f1-score (micro avg) 0.4035
2023-10-17 18:07:01,879 saving best model
2023-10-17 18:07:03,283 ----------------------------------------------------------------------------------------------------
2023-10-17 18:07:31,404 epoch 7 - iter 260/2606 - loss 0.02894101 - time (sec): 28.12 - samples/sec: 1363.81 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:07:58,656 epoch 7 - iter 520/2606 - loss 0.02724580 - time (sec): 55.37 - samples/sec: 1352.93 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:08:27,592 epoch 7 - iter 780/2606 - loss 0.03067550 - time (sec): 84.30 - samples/sec: 1333.06 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:08:54,620 epoch 7 - iter 1040/2606 - loss 0.02932069 - time (sec): 111.33 - samples/sec: 1319.87 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:09:21,942 epoch 7 - iter 1300/2606 - loss 0.02883956 - time (sec): 138.65 - samples/sec: 1322.62 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:09:49,932 epoch 7 - iter 1560/2606 - loss 0.02806752 - time (sec): 166.65 - samples/sec: 1331.28 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:10:16,796 epoch 7 - iter 1820/2606 - loss 0.02809168 - time (sec): 193.51 - samples/sec: 1329.05 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:10:44,355 epoch 7 - iter 2080/2606 - loss 0.02817759 - time (sec): 221.07 - samples/sec: 1321.32 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:11:12,175 epoch 7 - iter 2340/2606 - loss 0.02835946 - time (sec): 248.89 - samples/sec: 1318.43 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:11:40,537 epoch 7 - iter 2600/2606 - loss 0.02853150 - time (sec): 277.25 - samples/sec: 1320.98 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:11:41,332 ----------------------------------------------------------------------------------------------------
2023-10-17 18:11:41,332 EPOCH 7 done: loss 0.0285 - lr: 0.000010
2023-10-17 18:11:52,778 DEV : loss 0.4449138045310974 - f1-score (micro avg) 0.3797
2023-10-17 18:11:52,843 ----------------------------------------------------------------------------------------------------
2023-10-17 18:12:20,333 epoch 8 - iter 260/2606 - loss 0.02355353 - time (sec): 27.49 - samples/sec: 1308.83 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:12:48,478 epoch 8 - iter 520/2606 - loss 0.02242287 - time (sec): 55.63 - samples/sec: 1301.33 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:13:17,210 epoch 8 - iter 780/2606 - loss 0.02142082 - time (sec): 84.36 - samples/sec: 1302.22 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:13:44,116 epoch 8 - iter 1040/2606 - loss 0.02198135 - time (sec): 111.27 - samples/sec: 1318.24 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:14:11,882 epoch 8 - iter 1300/2606 - loss 0.02245041 - time (sec): 139.04 - samples/sec: 1320.33 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:14:38,817 epoch 8 - iter 1560/2606 - loss 0.02235451 - time (sec): 165.97 - samples/sec: 1320.02 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:15:06,477 epoch 8 - iter 1820/2606 - loss 0.02124624 - time (sec): 193.63 - samples/sec: 1321.69 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:15:33,535 epoch 8 - iter 2080/2606 - loss 0.02100098 - time (sec): 220.69 - samples/sec: 1321.00 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:16:00,553 epoch 8 - iter 2340/2606 - loss 0.02124546 - time (sec): 247.71 - samples/sec: 1323.91 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:16:29,052 epoch 8 - iter 2600/2606 - loss 0.02098403 - time (sec): 276.21 - samples/sec: 1327.93 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:16:29,596 ----------------------------------------------------------------------------------------------------
2023-10-17 18:16:29,597 EPOCH 8 done: loss 0.0211 - lr: 0.000007
2023-10-17 18:16:40,893 DEV : loss 0.5216322541236877 - f1-score (micro avg) 0.3707
2023-10-17 18:16:40,956 ----------------------------------------------------------------------------------------------------
2023-10-17 18:17:09,504 epoch 9 - iter 260/2606 - loss 0.01271966 - time (sec): 28.55 - samples/sec: 1427.43 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:17:36,981 epoch 9 - iter 520/2606 - loss 0.01328728 - time (sec): 56.02 - samples/sec: 1396.29 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:18:04,506 epoch 9 - iter 780/2606 - loss 0.01441554 - time (sec): 83.55 - samples/sec: 1357.80 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:18:32,793 epoch 9 - iter 1040/2606 - loss 0.01396545 - time (sec): 111.83 - samples/sec: 1344.18 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:18:59,609 epoch 9 - iter 1300/2606 - loss 0.01392302 - time (sec): 138.65 - samples/sec: 1349.91 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:19:26,372 epoch 9 - iter 1560/2606 - loss 0.01326058 - time (sec): 165.41 - samples/sec: 1349.25 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:19:53,465 epoch 9 - iter 1820/2606 - loss 0.01302468 - time (sec): 192.51 - samples/sec: 1344.95 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:20:21,078 epoch 9 - iter 2080/2606 - loss 0.01339090 - time (sec): 220.12 - samples/sec: 1348.97 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:20:48,720 epoch 9 - iter 2340/2606 - loss 0.01350316 - time (sec): 247.76 - samples/sec: 1349.79 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:21:13,941 epoch 9 - iter 2600/2606 - loss 0.01339440 - time (sec): 272.98 - samples/sec: 1343.38 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:21:14,446 ----------------------------------------------------------------------------------------------------
2023-10-17 18:21:14,446 EPOCH 9 done: loss 0.0135 - lr: 0.000003
2023-10-17 18:21:25,731 DEV : loss 0.5246009230613708 - f1-score (micro avg) 0.3787
2023-10-17 18:21:25,786 ----------------------------------------------------------------------------------------------------
2023-10-17 18:21:53,457 epoch 10 - iter 260/2606 - loss 0.01018569 - time (sec): 27.67 - samples/sec: 1373.20 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:22:22,398 epoch 10 - iter 520/2606 - loss 0.01216228 - time (sec): 56.61 - samples/sec: 1368.55 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:22:48,583 epoch 10 - iter 780/2606 - loss 0.01210997 - time (sec): 82.79 - samples/sec: 1337.18 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:23:16,814 epoch 10 - iter 1040/2606 - loss 0.01155669 - time (sec): 111.03 - samples/sec: 1338.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:23:44,121 epoch 10 - iter 1300/2606 - loss 0.01112163 - time (sec): 138.33 - samples/sec: 1322.20 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:24:11,721 epoch 10 - iter 1560/2606 - loss 0.01128677 - time (sec): 165.93 - samples/sec: 1324.56 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:24:41,182 epoch 10 - iter 1820/2606 - loss 0.01108459 - time (sec): 195.39 - samples/sec: 1327.32 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:25:08,860 epoch 10 - iter 2080/2606 - loss 0.01112871 - time (sec): 223.07 - samples/sec: 1326.85 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:25:35,011 epoch 10 - iter 2340/2606 - loss 0.01122759 - time (sec): 249.22 - samples/sec: 1322.79 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:26:02,919 epoch 10 - iter 2600/2606 - loss 0.01076733 - time (sec): 277.13 - samples/sec: 1322.70 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:26:03,521 ----------------------------------------------------------------------------------------------------
2023-10-17 18:26:03,521 EPOCH 10 done: loss 0.0107 - lr: 0.000000
2023-10-17 18:26:14,910 DEV : loss 0.5314543843269348 - f1-score (micro avg) 0.3801
2023-10-17 18:26:15,512 ----------------------------------------------------------------------------------------------------
2023-10-17 18:26:15,514 Loading model from best epoch ...
2023-10-17 18:26:18,076 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 18:26:38,263 Results:
- F-score (micro) 0.4345
- F-score (macro) 0.3022
- Accuracy 0.282

By class:
              precision    recall  f1-score   support

         LOC     0.4256    0.5255    0.4703      1214
         PER     0.4325    0.4480    0.4401       808
         ORG     0.2940    0.3031    0.2985       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.4091    0.4632    0.4345      2390
   macro avg     0.2880    0.3192    0.3022      2390
weighted avg     0.4058    0.4632    0.4318      2390

2023-10-17 18:26:38,263 ----------------------------------------------------------------------------------------------------
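The F-scores in the final evaluation are the harmonic mean of the corresponding precision and recall columns, so they can be sanity-checked directly from the figures in the log. A small sketch (the helper name `f1` is my own; all numbers are taken from the table above):

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: 2PR / (P + R)."""
    return 2 * precision * recall / (precision + recall)

# micro avg precision/recall from the final evaluation table
print(round(f1(0.4091, 0.4632), 4))  # 0.4345, the reported micro F-score
# per-class check for LOC
print(round(f1(0.4256, 0.5255), 4))  # 0.4703, the reported LOC f1-score
```

The macro average (0.3022) is instead the unweighted mean of the four per-class f1 values, which is why the zero-support HumanProd class pulls it well below the micro average.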