2023-10-17 15:20:15,009 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,010 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 15:20:15,010 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,010 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 15:20:15,010
----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,010 Train: 5777 sentences
2023-10-17 15:20:15,011 (train_with_dev=False, train_with_test=False)
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 Training Params:
2023-10-17 15:20:15,011 - learning_rate: "3e-05"
2023-10-17 15:20:15,011 - mini_batch_size: "4"
2023-10-17 15:20:15,011 - max_epochs: "10"
2023-10-17 15:20:15,011 - shuffle: "True"
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 Plugins:
2023-10-17 15:20:15,011 - TensorboardLogger
2023-10-17 15:20:15,011 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 15:20:15,011 - metric: "('micro avg', 'f1-score')"
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 Computation:
2023-10-17 15:20:15,011 - compute on device: cuda:0
2023-10-17 15:20:15,011 - embedding storage: none
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:15,011 Logging anything other than scalars to TensorBoard is
currently not supported.
2023-10-17 15:20:23,073 epoch 1 - iter 144/1445 - loss 2.33324949 - time (sec): 8.06 - samples/sec: 2302.06 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:20:30,031 epoch 1 - iter 288/1445 - loss 1.40669834 - time (sec): 15.02 - samples/sec: 2293.24 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:20:37,277 epoch 1 - iter 432/1445 - loss 0.99311609 - time (sec): 22.26 - samples/sec: 2340.14 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:20:44,228 epoch 1 - iter 576/1445 - loss 0.79532458 - time (sec): 29.22 - samples/sec: 2337.21 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:20:51,036 epoch 1 - iter 720/1445 - loss 0.66249281 - time (sec): 36.02 - samples/sec: 2401.24 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:20:58,193 epoch 1 - iter 864/1445 - loss 0.56791320 - time (sec): 43.18 - samples/sec: 2443.46 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:21:05,065 epoch 1 - iter 1008/1445 - loss 0.50430390 - time (sec): 50.05 - samples/sec: 2460.17 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:21:12,157 epoch 1 - iter 1152/1445 - loss 0.45583800 - time (sec): 57.14 - samples/sec: 2469.61 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:21:19,146 epoch 1 - iter 1296/1445 - loss 0.42025604 - time (sec): 64.13 - samples/sec: 2471.73 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:21:26,123 epoch 1 - iter 1440/1445 - loss 0.39100512 - time (sec): 71.11 - samples/sec: 2470.36 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:21:26,351 ----------------------------------------------------------------------------------------------------
2023-10-17 15:21:26,351 EPOCH 1 done: loss 0.3901 - lr: 0.000030
2023-10-17 15:21:29,006 DEV : loss 0.1134478896856308 - f1-score (micro avg) 0.6988
2023-10-17 15:21:29,021 saving best model
2023-10-17 15:21:29,419 ----------------------------------------------------------------------------------------------------
2023-10-17 15:21:36,099 epoch 2 - iter 144/1445 - loss 0.11423400 - time (sec):
6.68 - samples/sec: 2491.07 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:21:42,808 epoch 2 - iter 288/1445 - loss 0.11006024 - time (sec): 13.39 - samples/sec: 2534.73 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:21:49,527 epoch 2 - iter 432/1445 - loss 0.10104983 - time (sec): 20.11 - samples/sec: 2552.67 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:21:56,423 epoch 2 - iter 576/1445 - loss 0.09617376 - time (sec): 27.00 - samples/sec: 2535.33 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:22:03,695 epoch 2 - iter 720/1445 - loss 0.09369082 - time (sec): 34.27 - samples/sec: 2531.86 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:22:11,248 epoch 2 - iter 864/1445 - loss 0.09042169 - time (sec): 41.83 - samples/sec: 2531.47 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:22:18,355 epoch 2 - iter 1008/1445 - loss 0.09025166 - time (sec): 48.93 - samples/sec: 2507.99 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:22:25,543 epoch 2 - iter 1152/1445 - loss 0.09048999 - time (sec): 56.12 - samples/sec: 2502.98 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:22:32,470 epoch 2 - iter 1296/1445 - loss 0.09136875 - time (sec): 63.05 - samples/sec: 2497.21 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:22:39,747 epoch 2 - iter 1440/1445 - loss 0.09254084 - time (sec): 70.33 - samples/sec: 2499.02 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:22:39,976 ----------------------------------------------------------------------------------------------------
2023-10-17 15:22:39,977 EPOCH 2 done: loss 0.0928 - lr: 0.000027
2023-10-17 15:22:43,507 DEV : loss 0.09797008335590363 - f1-score (micro avg) 0.7636
2023-10-17 15:22:43,523 saving best model
2023-10-17 15:22:44,069 ----------------------------------------------------------------------------------------------------
2023-10-17 15:22:51,178 epoch 3 - iter 144/1445 - loss 0.07299256 - time (sec): 7.11 - samples/sec: 2444.22 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:22:57,977 epoch 3 - iter
288/1445 - loss 0.06634279 - time (sec): 13.91 - samples/sec: 2490.30 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:23:05,074 epoch 3 - iter 432/1445 - loss 0.06581683 - time (sec): 21.00 - samples/sec: 2551.84 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:23:11,936 epoch 3 - iter 576/1445 - loss 0.07201697 - time (sec): 27.87 - samples/sec: 2538.19 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:23:18,903 epoch 3 - iter 720/1445 - loss 0.07104181 - time (sec): 34.83 - samples/sec: 2511.65 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:23:25,927 epoch 3 - iter 864/1445 - loss 0.07023237 - time (sec): 41.86 - samples/sec: 2516.01 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:23:33,061 epoch 3 - iter 1008/1445 - loss 0.06913832 - time (sec): 48.99 - samples/sec: 2490.96 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:23:40,153 epoch 3 - iter 1152/1445 - loss 0.06852697 - time (sec): 56.08 - samples/sec: 2486.94 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:23:47,600 epoch 3 - iter 1296/1445 - loss 0.06920774 - time (sec): 63.53 - samples/sec: 2481.25 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:23:54,842 epoch 3 - iter 1440/1445 - loss 0.06776713 - time (sec): 70.77 - samples/sec: 2483.99 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:23:55,074 ----------------------------------------------------------------------------------------------------
2023-10-17 15:23:55,074 EPOCH 3 done: loss 0.0679 - lr: 0.000023
2023-10-17 15:23:58,305 DEV : loss 0.07238871604204178 - f1-score (micro avg) 0.8599
2023-10-17 15:23:58,320 saving best model
2023-10-17 15:23:58,869 ----------------------------------------------------------------------------------------------------
2023-10-17 15:24:05,938 epoch 4 - iter 144/1445 - loss 0.03825652 - time (sec): 7.07 - samples/sec: 2582.66 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:24:13,011 epoch 4 - iter 288/1445 - loss 0.05121297 - time (sec): 14.14 - samples/sec: 2515.02 - lr: 0.000023 - momentum:
0.000000
2023-10-17 15:24:20,248 epoch 4 - iter 432/1445 - loss 0.04707007 - time (sec): 21.38 - samples/sec: 2475.52 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:24:27,099 epoch 4 - iter 576/1445 - loss 0.04864143 - time (sec): 28.23 - samples/sec: 2490.67 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:24:34,071 epoch 4 - iter 720/1445 - loss 0.05041580 - time (sec): 35.20 - samples/sec: 2469.04 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:24:41,242 epoch 4 - iter 864/1445 - loss 0.05057361 - time (sec): 42.37 - samples/sec: 2466.87 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:24:48,213 epoch 4 - iter 1008/1445 - loss 0.04981836 - time (sec): 49.34 - samples/sec: 2478.95 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:24:54,867 epoch 4 - iter 1152/1445 - loss 0.05022101 - time (sec): 56.00 - samples/sec: 2498.44 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:25:01,564 epoch 4 - iter 1296/1445 - loss 0.05005487 - time (sec): 62.69 - samples/sec: 2514.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:25:08,655 epoch 4 - iter 1440/1445 - loss 0.05140703 - time (sec): 69.78 - samples/sec: 2519.10 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:25:08,886 ----------------------------------------------------------------------------------------------------
2023-10-17 15:25:08,886 EPOCH 4 done: loss 0.0513 - lr: 0.000020
2023-10-17 15:25:12,521 DEV : loss 0.08748035877943039 - f1-score (micro avg) 0.8513
2023-10-17 15:25:12,536 ----------------------------------------------------------------------------------------------------
2023-10-17 15:25:20,019 epoch 5 - iter 144/1445 - loss 0.02484292 - time (sec): 7.48 - samples/sec: 2365.34 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:25:27,347 epoch 5 - iter 288/1445 - loss 0.02639251 - time (sec): 14.81 - samples/sec: 2426.84 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:25:34,576 epoch 5 - iter 432/1445 - loss 0.03062150 - time (sec): 22.04 - samples/sec: 2438.65 - lr: 0.000019 - momentum:
0.000000
2023-10-17 15:25:41,500 epoch 5 - iter 576/1445 - loss 0.03478141 - time (sec): 28.96 - samples/sec: 2433.14 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:25:48,647 epoch 5 - iter 720/1445 - loss 0.03435012 - time (sec): 36.11 - samples/sec: 2431.81 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:25:55,712 epoch 5 - iter 864/1445 - loss 0.03915073 - time (sec): 43.17 - samples/sec: 2429.56 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:26:02,874 epoch 5 - iter 1008/1445 - loss 0.04019423 - time (sec): 50.34 - samples/sec: 2424.89 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:26:09,941 epoch 5 - iter 1152/1445 - loss 0.04137874 - time (sec): 57.40 - samples/sec: 2449.49 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:26:16,823 epoch 5 - iter 1296/1445 - loss 0.04231658 - time (sec): 64.29 - samples/sec: 2455.18 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:26:24,165 epoch 5 - iter 1440/1445 - loss 0.04229848 - time (sec): 71.63 - samples/sec: 2452.34 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:26:24,411 ----------------------------------------------------------------------------------------------------
2023-10-17 15:26:24,411 EPOCH 5 done: loss 0.0423 - lr: 0.000017
2023-10-17 15:26:27,792 DEV : loss 0.11643949151039124 - f1-score (micro avg) 0.7813
2023-10-17 15:26:27,810 ----------------------------------------------------------------------------------------------------
2023-10-17 15:26:35,160 epoch 6 - iter 144/1445 - loss 0.04407312 - time (sec): 7.35 - samples/sec: 2381.66 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:26:42,213 epoch 6 - iter 288/1445 - loss 0.07041699 - time (sec): 14.40 - samples/sec: 2363.33 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:26:49,199 epoch 6 - iter 432/1445 - loss 0.06543099 - time (sec): 21.39 - samples/sec: 2412.35 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:26:56,329 epoch 6 - iter 576/1445 - loss 0.05494049 - time (sec): 28.52 - samples/sec: 2447.24 - lr: 0.000015 - momentum:
0.000000
2023-10-17 15:27:03,461 epoch 6 - iter 720/1445 - loss 0.05313692 - time (sec): 35.65 - samples/sec: 2468.33 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:27:10,213 epoch 6 - iter 864/1445 - loss 0.04946118 - time (sec): 42.40 - samples/sec: 2462.88 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:27:16,992 epoch 6 - iter 1008/1445 - loss 0.04715937 - time (sec): 49.18 - samples/sec: 2492.01 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:27:23,900 epoch 6 - iter 1152/1445 - loss 0.04549664 - time (sec): 56.09 - samples/sec: 2479.47 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:27:30,845 epoch 6 - iter 1296/1445 - loss 0.04407099 - time (sec): 63.03 - samples/sec: 2486.69 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:27:37,799 epoch 6 - iter 1440/1445 - loss 0.04231390 - time (sec): 69.99 - samples/sec: 2507.53 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:27:38,026 ----------------------------------------------------------------------------------------------------
2023-10-17 15:27:38,026 EPOCH 6 done: loss 0.0422 - lr: 0.000013
2023-10-17 15:27:41,262 DEV : loss 0.12753015756607056 - f1-score (micro avg) 0.8179
2023-10-17 15:27:41,277 ----------------------------------------------------------------------------------------------------
2023-10-17 15:27:48,074 epoch 7 - iter 144/1445 - loss 0.02864804 - time (sec): 6.80 - samples/sec: 2541.40 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:27:54,732 epoch 7 - iter 288/1445 - loss 0.02850202 - time (sec): 13.45 - samples/sec: 2529.29 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:28:01,745 epoch 7 - iter 432/1445 - loss 0.02700154 - time (sec): 20.47 - samples/sec: 2538.36 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:28:09,285 epoch 7 - iter 576/1445 - loss 0.02685090 - time (sec): 28.01 - samples/sec: 2500.37 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:28:16,228 epoch 7 - iter 720/1445 - loss 0.02631576 - time (sec): 34.95 - samples/sec: 2498.86 - lr: 0.000012 - momentum:
0.000000
2023-10-17 15:28:23,229 epoch 7 - iter 864/1445 - loss 0.02503229 - time (sec): 41.95 - samples/sec: 2527.10 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:28:30,267 epoch 7 - iter 1008/1445 - loss 0.02322476 - time (sec): 48.99 - samples/sec: 2513.53 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:28:37,208 epoch 7 - iter 1152/1445 - loss 0.02262618 - time (sec): 55.93 - samples/sec: 2507.14 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:28:44,333 epoch 7 - iter 1296/1445 - loss 0.02359839 - time (sec): 63.05 - samples/sec: 2507.22 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:28:51,234 epoch 7 - iter 1440/1445 - loss 0.02410498 - time (sec): 69.96 - samples/sec: 2512.80 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:28:51,455 ----------------------------------------------------------------------------------------------------
2023-10-17 15:28:51,456 EPOCH 7 done: loss 0.0241 - lr: 0.000010
2023-10-17 15:28:54,762 DEV : loss 0.13132773339748383 - f1-score (micro avg) 0.8242
2023-10-17 15:28:54,780 ----------------------------------------------------------------------------------------------------
2023-10-17 15:29:01,739 epoch 8 - iter 144/1445 - loss 0.02599645 - time (sec): 6.96 - samples/sec: 2320.98 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:29:09,413 epoch 8 - iter 288/1445 - loss 0.03908539 - time (sec): 14.63 - samples/sec: 2361.44 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:29:16,244 epoch 8 - iter 432/1445 - loss 0.04312177 - time (sec): 21.46 - samples/sec: 2436.90 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:29:23,284 epoch 8 - iter 576/1445 - loss 0.04654217 - time (sec): 28.50 - samples/sec: 2422.07 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:29:30,118 epoch 8 - iter 720/1445 - loss 0.04836767 - time (sec): 35.34 - samples/sec: 2436.55 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:29:37,165 epoch 8 - iter 864/1445 - loss 0.04435024 - time (sec): 42.38 - samples/sec: 2467.42 - lr: 0.000008 - momentum:
0.000000
2023-10-17 15:29:44,054 epoch 8 - iter 1008/1445 - loss 0.04368948 - time (sec): 49.27 - samples/sec: 2483.93 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:29:51,080 epoch 8 - iter 1152/1445 - loss 0.04126688 - time (sec): 56.30 - samples/sec: 2473.99 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:29:58,174 epoch 8 - iter 1296/1445 - loss 0.03803387 - time (sec): 63.39 - samples/sec: 2489.99 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:30:05,225 epoch 8 - iter 1440/1445 - loss 0.03611431 - time (sec): 70.44 - samples/sec: 2491.01 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:30:05,479 ----------------------------------------------------------------------------------------------------
2023-10-17 15:30:05,479 EPOCH 8 done: loss 0.0360 - lr: 0.000007
2023-10-17 15:30:08,822 DEV : loss 0.14063192903995514 - f1-score (micro avg) 0.8385
2023-10-17 15:30:08,840 ----------------------------------------------------------------------------------------------------
2023-10-17 15:30:16,101 epoch 9 - iter 144/1445 - loss 0.00928856 - time (sec): 7.26 - samples/sec: 2645.70 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:30:22,830 epoch 9 - iter 288/1445 - loss 0.01137305 - time (sec): 13.99 - samples/sec: 2509.03 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:30:30,015 epoch 9 - iter 432/1445 - loss 0.01191848 - time (sec): 21.17 - samples/sec: 2557.99 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:30:37,152 epoch 9 - iter 576/1445 - loss 0.01374329 - time (sec): 28.31 - samples/sec: 2558.62 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:30:44,261 epoch 9 - iter 720/1445 - loss 0.01533254 - time (sec): 35.42 - samples/sec: 2528.74 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:30:51,281 epoch 9 - iter 864/1445 - loss 0.01618231 - time (sec): 42.44 - samples/sec: 2487.04 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:30:58,224 epoch 9 - iter 1008/1445 - loss 0.01852939 - time (sec): 49.38 - samples/sec: 2494.62 - lr: 0.000004 - momentum:
0.000000
2023-10-17 15:31:05,713 epoch 9 - iter 1152/1445 - loss 0.02014171 - time (sec): 56.87 - samples/sec: 2484.98 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:31:12,765 epoch 9 - iter 1296/1445 - loss 0.02108402 - time (sec): 63.92 - samples/sec: 2486.65 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:31:19,478 epoch 9 - iter 1440/1445 - loss 0.02161319 - time (sec): 70.64 - samples/sec: 2483.77 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:31:19,741 ----------------------------------------------------------------------------------------------------
2023-10-17 15:31:19,741 EPOCH 9 done: loss 0.0215 - lr: 0.000003
2023-10-17 15:31:22,962 DEV : loss 0.15872889757156372 - f1-score (micro avg) 0.7959
2023-10-17 15:31:22,980 ----------------------------------------------------------------------------------------------------
2023-10-17 15:31:29,837 epoch 10 - iter 144/1445 - loss 0.02704362 - time (sec): 6.86 - samples/sec: 2545.17 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:31:36,571 epoch 10 - iter 288/1445 - loss 0.03061519 - time (sec): 13.59 - samples/sec: 2579.66 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:31:43,536 epoch 10 - iter 432/1445 - loss 0.02831624 - time (sec): 20.55 - samples/sec: 2514.98 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:31:50,270 epoch 10 - iter 576/1445 - loss 0.02567299 - time (sec): 27.29 - samples/sec: 2489.86 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:31:57,397 epoch 10 - iter 720/1445 - loss 0.02269889 - time (sec): 34.42 - samples/sec: 2518.68 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:32:04,472 epoch 10 - iter 864/1445 - loss 0.02217955 - time (sec): 41.49 - samples/sec: 2540.09 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:32:11,337 epoch 10 - iter 1008/1445 - loss 0.02100455 - time (sec): 48.36 - samples/sec: 2523.83 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:32:18,547 epoch 10 - iter 1152/1445 - loss 0.02030918 - time (sec): 55.57 - samples/sec: 2521.36 - lr: 0.000001 -
momentum: 0.000000
2023-10-17 15:32:25,459 epoch 10 - iter 1296/1445 - loss 0.01984945 - time (sec): 62.48 - samples/sec: 2530.72 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:32:32,454 epoch 10 - iter 1440/1445 - loss 0.01908284 - time (sec): 69.47 - samples/sec: 2529.70 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:32:32,675 ----------------------------------------------------------------------------------------------------
2023-10-17 15:32:32,675 EPOCH 10 done: loss 0.0191 - lr: 0.000000
2023-10-17 15:32:35,914 DEV : loss 0.14746923744678497 - f1-score (micro avg) 0.8243
2023-10-17 15:32:36,350 ----------------------------------------------------------------------------------------------------
2023-10-17 15:32:36,352 Loading model from best epoch ...
2023-10-17 15:32:38,127 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 15:32:40,946 Results:
- F-score (micro) 0.8538
- F-score (macro) 0.7349
- Accuracy 0.7522

By class:
              precision    recall  f1-score   support

         PER     0.8758    0.8340    0.8544       482
         LOC     0.9385    0.8996    0.9186       458
         ORG     0.4286    0.4348    0.4317        69

   micro avg     0.8719    0.8365    0.8538      1009
   macro avg     0.7476    0.7228    0.7349      1009
weighted avg     0.8737    0.8365    0.8546      1009

2023-10-17 15:32:40,946 ----------------------------------------------------------------------------------------------------