2023-10-13 16:32:27,478 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,479 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 16:32:27,479 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
 - NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 Train: 5901 sentences
2023-10-13 16:32:27,480 (train_with_dev=False, train_with_test=False)
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 Training Params:
2023-10-13 16:32:27,480  - learning_rate: "5e-05"
2023-10-13 16:32:27,480  - mini_batch_size: "4"
2023-10-13 16:32:27,480  - max_epochs: "10"
2023-10-13 16:32:27,480  - shuffle: "True"
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 Plugins:
2023-10-13 16:32:27,480  - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 16:32:27,480  - metric: "('micro avg', 'f1-score')"
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 Computation:
2023-10-13 16:32:27,480  - compute on device: cuda:0
2023-10-13 16:32:27,480  - embedding storage: none
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:34,656 epoch 1 - iter 147/1476 - loss 2.19264043 - time (sec): 7.17 - samples/sec: 2468.65 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:32:41,887 epoch 1 - iter 294/1476 - loss 1.34715324 - time (sec): 14.41 - samples/sec: 2494.63 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:32:48,820 epoch 1 - iter 441/1476 - loss 1.03252218 - time (sec): 21.34 - samples/sec: 2436.06 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:32:55,839 epoch 1 - iter 588/1476 - loss 0.84618778 - time (sec): 28.36 - samples/sec: 2423.52 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:33:03,052 epoch 1 - iter 735/1476 - loss 0.73617883 - time (sec): 35.57 - samples/sec: 2406.00 - lr: 0.000025 - momentum: 0.000000
2023-10-13 16:33:10,075 epoch 1 - iter 882/1476 - loss 0.65221352 - time (sec): 42.59 - samples/sec: 2401.55 - lr: 0.000030 - momentum: 0.000000
2023-10-13 16:33:16,986 epoch 1 - iter 1029/1476 - loss 0.59171549 - time (sec): 49.50 - samples/sec: 2395.74 - lr: 0.000035 - momentum: 0.000000
2023-10-13 16:33:23,748 epoch 1 - iter 1176/1476 - loss 0.54902197 - time (sec): 56.27 - samples/sec: 2370.02 - lr: 0.000040 - momentum: 0.000000
2023-10-13 16:33:30,731 epoch 1 - iter 1323/1476 - loss 0.50983066 - time (sec): 63.25 - samples/sec: 2366.30 - lr: 0.000045 - momentum: 0.000000
2023-10-13 16:33:37,571 epoch 1 - iter 1470/1476 - loss 0.47806023 - time (sec): 70.09 - samples/sec: 2366.46 - lr: 0.000050 - momentum: 0.000000
2023-10-13 16:33:37,841 ----------------------------------------------------------------------------------------------------
2023-10-13 16:33:37,841 EPOCH 1 done: loss 0.4771 - lr: 0.000050
2023-10-13 16:33:44,025 DEV : loss 0.14038856327533722 - f1-score (micro avg) 0.6914
2023-10-13 16:33:44,053 saving best model
2023-10-13 16:33:44,489 ----------------------------------------------------------------------------------------------------
2023-10-13 16:33:51,114 epoch 2 - iter 147/1476 - loss 0.13041987 - time (sec): 6.62 - samples/sec: 2298.32 - lr: 0.000049 - momentum: 0.000000
2023-10-13 16:33:57,959 epoch 2 - iter 294/1476 - loss 0.13920225 - time (sec): 13.47 - samples/sec: 2333.22 - lr: 0.000049 - momentum: 0.000000
2023-10-13 16:34:04,703 epoch 2 - iter 441/1476 - loss 0.13806486 - time (sec): 20.21 - samples/sec: 2363.50 - lr: 0.000048 - momentum: 0.000000
2023-10-13 16:34:11,574 epoch 2 - iter 588/1476 - loss 0.13521110 - time (sec): 27.08 - samples/sec: 2354.34 - lr: 0.000048 - momentum: 0.000000
2023-10-13 16:34:18,376 epoch 2 - iter 735/1476 - loss 0.14108629 - time (sec): 33.89 - samples/sec: 2331.03 - lr: 0.000047 - momentum: 0.000000
2023-10-13 16:34:25,372 epoch 2 - iter 882/1476 - loss 0.13917247 - time (sec): 40.88 - samples/sec: 2349.90 - lr: 0.000047 - momentum: 0.000000
2023-10-13 16:34:32,628 epoch 2 - iter 1029/1476 - loss 0.13722354 - time (sec): 48.14 - samples/sec: 2377.81 - lr: 0.000046 - momentum: 0.000000
2023-10-13 16:34:39,515 epoch 2 - iter 1176/1476 - loss 0.13237447 - time (sec): 55.02 - samples/sec: 2379.51 - lr: 0.000046 - momentum: 0.000000
2023-10-13 16:34:46,566 epoch 2 - iter 1323/1476 - loss 0.13528695 - time (sec): 62.08 - samples/sec: 2384.90 - lr: 0.000045 - momentum: 0.000000
2023-10-13 16:34:53,726 epoch 2 - iter 1470/1476 - loss 0.13631749 - time (sec): 69.24 - samples/sec: 2391.96 - lr: 0.000044 - momentum: 0.000000
2023-10-13 16:34:54,002 ----------------------------------------------------------------------------------------------------
2023-10-13 16:34:54,002 EPOCH 2 done: loss 0.1364 - lr: 0.000044
2023-10-13 16:35:05,260 DEV : loss 0.14942534267902374 - f1-score (micro avg) 0.7366
2023-10-13 16:35:05,289 saving best model
2023-10-13 16:35:05,779 ----------------------------------------------------------------------------------------------------
2023-10-13 16:35:12,635 epoch 3 - iter 147/1476 - loss 0.07591353 - time (sec): 6.85 - samples/sec: 2265.56 - lr: 0.000044 - momentum: 0.000000
2023-10-13 16:35:19,528 epoch 3 - iter 294/1476 - loss 0.09227817 - time (sec): 13.74 - samples/sec: 2354.13 - lr: 0.000043 - momentum: 0.000000
2023-10-13 16:35:26,354 epoch 3 - iter 441/1476 - loss 0.09626614 - time (sec): 20.57 - samples/sec: 2363.80 - lr: 0.000043 - momentum: 0.000000
2023-10-13 16:35:33,095 epoch 3 - iter 588/1476 - loss 0.09646343 - time (sec): 27.31 - samples/sec: 2347.82 - lr: 0.000042 - momentum: 0.000000
2023-10-13 16:35:40,240 epoch 3 - iter 735/1476 - loss 0.09416145 - time (sec): 34.45 - samples/sec: 2367.35 - lr: 0.000042 - momentum: 0.000000
2023-10-13 16:35:47,371 epoch 3 - iter 882/1476 - loss 0.09456799 - time (sec): 41.59 - samples/sec: 2406.31 - lr: 0.000041 - momentum: 0.000000
2023-10-13 16:35:54,299 epoch 3 - iter 1029/1476 - loss 0.09149587 - time (sec): 48.51 - samples/sec: 2393.68 - lr: 0.000041 - momentum: 0.000000
2023-10-13 16:36:01,363 epoch 3 - iter 1176/1476 - loss 0.09190785 - time (sec): 55.58 - samples/sec: 2410.90 - lr: 0.000040 - momentum: 0.000000
2023-10-13 16:36:08,187 epoch 3 - iter 1323/1476 - loss 0.09115192 - time (sec): 62.40 - samples/sec: 2407.82 - lr: 0.000039 - momentum: 0.000000
2023-10-13 16:36:15,108 epoch 3 - iter 1470/1476 - loss 0.09131695 - time (sec): 69.32 - samples/sec: 2391.98 - lr: 0.000039 - momentum: 0.000000
2023-10-13 16:36:15,376 ----------------------------------------------------------------------------------------------------
2023-10-13 16:36:15,376 EPOCH 3 done: loss 0.0912 - lr: 0.000039
2023-10-13 16:36:26,447 DEV : loss 0.17109665274620056 - f1-score (micro avg) 0.7761
2023-10-13 16:36:26,476 saving best model
2023-10-13 16:36:27,050 ----------------------------------------------------------------------------------------------------
2023-10-13 16:36:33,918 epoch 4 - iter 147/1476 - loss 0.06448851 - time (sec): 6.86 - samples/sec: 2220.09 - lr: 0.000038 - momentum: 0.000000
2023-10-13 16:36:40,610 epoch 4 - iter 294/1476 - loss 0.05935983 - time (sec): 13.56 - samples/sec: 2294.42 - lr: 0.000038 - momentum: 0.000000
2023-10-13 16:36:47,491 epoch 4 - iter 441/1476 - loss 0.06577537 - time (sec): 20.44 - samples/sec: 2339.67 - lr: 0.000037 - momentum: 0.000000
2023-10-13 16:36:54,183 epoch 4 - iter 588/1476 - loss 0.06370900 - time (sec): 27.13 - samples/sec: 2330.56 - lr: 0.000037 - momentum: 0.000000
2023-10-13 16:37:01,145 epoch 4 - iter 735/1476 - loss 0.06430938 - time (sec): 34.09 - samples/sec: 2344.21 - lr: 0.000036 - momentum: 0.000000
2023-10-13 16:37:08,310 epoch 4 - iter 882/1476 - loss 0.06262401 - time (sec): 41.26 - samples/sec: 2361.23 - lr: 0.000036 - momentum: 0.000000
2023-10-13 16:37:15,650 epoch 4 - iter 1029/1476 - loss 0.06226410 - time (sec): 48.60 - samples/sec: 2394.58 - lr: 0.000035 - momentum: 0.000000
2023-10-13 16:37:22,567 epoch 4 - iter 1176/1476 - loss 0.06322309 - time (sec): 55.51 - samples/sec: 2395.17 - lr: 0.000034 - momentum: 0.000000
2023-10-13 16:37:29,580 epoch 4 - iter 1323/1476 - loss 0.06623547 - time (sec): 62.53 - samples/sec: 2394.70 - lr: 0.000034 - momentum: 0.000000
2023-10-13 16:37:36,240 epoch 4 - iter 1470/1476 - loss 0.06439991 - time (sec): 69.18 - samples/sec: 2395.39 - lr: 0.000033 - momentum: 0.000000
2023-10-13 16:37:36,522 ----------------------------------------------------------------------------------------------------
2023-10-13 16:37:36,522 EPOCH 4 done: loss 0.0647 - lr: 0.000033
2023-10-13 16:37:47,651 DEV : loss 0.22011879086494446 - f1-score (micro avg) 0.7734
2023-10-13 16:37:47,680 ----------------------------------------------------------------------------------------------------
2023-10-13 16:37:54,663 epoch 5 - iter 147/1476 - loss 0.05406225 - time (sec): 6.98 - samples/sec: 2371.22 - lr: 0.000033 - momentum: 0.000000
2023-10-13 16:38:02,070 epoch 5 - iter 294/1476 - loss 0.05659229 - time (sec): 14.39 - samples/sec: 2313.32 - lr: 0.000032 - momentum: 0.000000
2023-10-13 16:38:09,156 epoch 5 - iter 441/1476 - loss 0.05185982 - time (sec): 21.48 - samples/sec: 2344.21 - lr: 0.000032 - momentum: 0.000000
2023-10-13 16:38:15,969 epoch 5 - iter 588/1476 - loss 0.05025944 - time (sec): 28.29 - samples/sec: 2347.40 - lr: 0.000031 - momentum: 0.000000
2023-10-13 16:38:22,835 epoch 5 - iter 735/1476 - loss 0.04794011 - time (sec): 35.15 - samples/sec: 2348.11 - lr: 0.000031 - momentum: 0.000000
2023-10-13 16:38:29,650 epoch 5 - iter 882/1476 - loss 0.04879169 - time (sec): 41.97 - samples/sec: 2336.67 - lr: 0.000030 - momentum: 0.000000
2023-10-13 16:38:36,556 epoch 5 - iter 1029/1476 - loss 0.04784084 - time (sec): 48.88 - samples/sec: 2340.55 - lr: 0.000029 - momentum: 0.000000
2023-10-13 16:38:43,817 epoch 5 - iter 1176/1476 - loss 0.04785536 - time (sec): 56.14 - samples/sec: 2366.35 - lr: 0.000029 - momentum: 0.000000
2023-10-13 16:38:50,949 epoch 5 - iter 1323/1476 - loss 0.04673364 - time (sec): 63.27 - samples/sec: 2382.16 - lr: 0.000028 - momentum: 0.000000
2023-10-13 16:38:57,668 epoch 5 - iter 1470/1476 - loss 0.04668226 - time (sec): 69.99 - samples/sec: 2369.74 - lr: 0.000028 - momentum: 0.000000
2023-10-13 16:38:57,926 ----------------------------------------------------------------------------------------------------
2023-10-13 16:38:57,927 EPOCH 5 done: loss 0.0466 - lr: 0.000028
2023-10-13 16:39:09,057 DEV : loss 0.18591712415218353 - f1-score (micro avg) 0.7918
2023-10-13 16:39:09,086 saving best model
2023-10-13 16:39:09,678 ----------------------------------------------------------------------------------------------------
2023-10-13 16:39:16,341 epoch 6 - iter 147/1476 - loss 0.02523404 - time (sec): 6.66 - samples/sec: 2241.56 - lr: 0.000027 - momentum: 0.000000
2023-10-13 16:39:23,617 epoch 6 - iter 294/1476 - loss 0.02890210 - time (sec): 13.94 - samples/sec: 2449.63 - lr: 0.000027 - momentum: 0.000000
2023-10-13 16:39:30,543 epoch 6 - iter 441/1476 - loss 0.03369659 - time (sec): 20.86 - samples/sec: 2447.46 - lr: 0.000026 - momentum: 0.000000
2023-10-13 16:39:37,495 epoch 6 - iter 588/1476 - loss 0.03450237 - time (sec): 27.81 - samples/sec: 2424.06 - lr: 0.000026 - momentum: 0.000000
2023-10-13 16:39:44,573 epoch 6 - iter 735/1476 - loss 0.03447603 - time (sec): 34.89 - samples/sec: 2434.04 - lr: 0.000025 - momentum: 0.000000
2023-10-13 16:39:51,749 epoch 6 - iter 882/1476 - loss 0.03728195 - time (sec): 42.07 - samples/sec: 2428.75 - lr: 0.000024 - momentum: 0.000000
2023-10-13 16:39:58,407 epoch 6 - iter 1029/1476 - loss 0.03609371 - time (sec): 48.73 - samples/sec: 2409.30 - lr: 0.000024 - momentum: 0.000000
2023-10-13 16:40:05,348 epoch 6 - iter 1176/1476 - loss 0.03396739 - time (sec): 55.67 - samples/sec: 2406.36 - lr: 0.000023 - momentum: 0.000000
2023-10-13 16:40:12,191 epoch 6 - iter 1323/1476 - loss 0.03505791 - time (sec): 62.51 - samples/sec: 2396.41 - lr: 0.000023 - momentum: 0.000000
2023-10-13 16:40:19,135 epoch 6 - iter 1470/1476 - loss 0.03535902 - time (sec): 69.45 - samples/sec: 2389.64 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:40:19,401 ----------------------------------------------------------------------------------------------------
2023-10-13 16:40:19,401 EPOCH 6 done: loss 0.0353 - lr: 0.000022
2023-10-13 16:40:30,593 DEV : loss 0.20590120553970337 - f1-score (micro avg) 0.7935
2023-10-13 16:40:30,622 saving best model
2023-10-13 16:40:31,204 ----------------------------------------------------------------------------------------------------
2023-10-13 16:40:38,565 epoch 7 - iter 147/1476 - loss 0.01875450 - time (sec): 7.36 - samples/sec: 2324.91 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:40:45,309 epoch 7 - iter 294/1476 - loss 0.01827635 - time (sec): 14.10 - samples/sec: 2279.02 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:40:52,625 epoch 7 - iter 441/1476 - loss 0.02228262 - time (sec): 21.42 - samples/sec: 2360.36 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:41:00,112 epoch 7 - iter 588/1476 - loss 0.02120762 - time (sec): 28.91 - samples/sec: 2398.25 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:41:06,805 epoch 7 - iter 735/1476 - loss 0.02126680 - time (sec): 35.60 - samples/sec: 2370.88 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:41:13,433 epoch 7 - iter 882/1476 - loss 0.02140474 - time (sec): 42.23 - samples/sec: 2362.59 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:41:19,855 epoch 7 - iter 1029/1476 - loss 0.02223445 - time (sec): 48.65 - samples/sec: 2381.63 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:41:26,472 epoch 7 - iter 1176/1476 - loss 0.02219911 - time (sec): 55.26 - samples/sec: 2380.18 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:41:33,320 epoch 7 - iter 1323/1476 - loss 0.02216569 - time (sec): 62.11 - samples/sec: 2376.25 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:41:40,533 epoch 7 - iter 1470/1476 - loss 0.02249640 - time (sec): 69.33 - samples/sec: 2392.97 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:41:40,788 ----------------------------------------------------------------------------------------------------
2023-10-13 16:41:40,788 EPOCH 7 done: loss 0.0226 - lr: 0.000017
2023-10-13 16:41:51,974 DEV : loss 0.23674637079238892 - f1-score (micro avg) 0.7971
2023-10-13 16:41:52,003 saving best model
2023-10-13 16:41:52,500 ----------------------------------------------------------------------------------------------------
2023-10-13 16:41:59,567 epoch 8 - iter 147/1476 - loss 0.01813003 - time (sec): 7.07 - samples/sec: 2377.41 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:42:06,585 epoch 8 - iter 294/1476 - loss 0.01758301 - time (sec): 14.08 - samples/sec: 2397.59 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:42:13,868 epoch 8 - iter 441/1476 - loss 0.02116504 - time (sec): 21.37 - samples/sec: 2488.85 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:42:20,522 epoch 8 - iter 588/1476 - loss 0.02109938 - time (sec): 28.02 - samples/sec: 2426.58 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:42:27,485 epoch 8 - iter 735/1476 - loss 0.01950654 - time (sec): 34.98 - samples/sec: 2394.96 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:42:34,305 epoch 8 - iter 882/1476 - loss 0.01842251 - time (sec): 41.80 - samples/sec: 2372.96 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:42:41,420 epoch 8 - iter 1029/1476 - loss 0.01674893 - time (sec): 48.92 - samples/sec: 2354.63 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:42:48,112 epoch 8 - iter 1176/1476 - loss 0.01663061 - time (sec): 55.61 - samples/sec: 2343.53 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:42:55,261 epoch 8 - iter 1323/1476 - loss 0.01632528 - time (sec): 62.76 - samples/sec: 2359.37 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:43:02,148 epoch 8 - iter 1470/1476 - loss 0.01581879 - time (sec): 69.65 - samples/sec: 2380.38 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:43:02,409 ----------------------------------------------------------------------------------------------------
2023-10-13 16:43:02,409 EPOCH 8 done: loss 0.0158 - lr: 0.000011
2023-10-13 16:43:13,521 DEV : loss 0.25357791781425476 - f1-score (micro avg) 0.7984
2023-10-13 16:43:13,550 saving best model
2023-10-13 16:43:14,117 ----------------------------------------------------------------------------------------------------
2023-10-13 16:43:21,014 epoch 9 - iter 147/1476 - loss 0.00927471 - time (sec): 6.89 - samples/sec: 2259.68 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:43:28,062 epoch 9 - iter 294/1476 - loss 0.00809657 - time (sec): 13.94 - samples/sec: 2364.85 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:43:34,763 epoch 9 - iter 441/1476 - loss 0.00782182 - time (sec): 20.64 - samples/sec: 2353.66 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:43:41,562 epoch 9 - iter 588/1476 - loss 0.00861677 - time (sec): 27.44 - samples/sec: 2378.37 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:43:48,725 epoch 9 - iter 735/1476 - loss 0.00955088 - time (sec): 34.60 - samples/sec: 2408.79 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:43:55,520 epoch 9 - iter 882/1476 - loss 0.00855623 - time (sec): 41.40 - samples/sec: 2391.02 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:44:02,540 epoch 9 - iter 1029/1476 - loss 0.00809094 - time (sec): 48.42 - samples/sec: 2397.34 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:44:09,395 epoch 9 - iter 1176/1476 - loss 0.00852934 - time (sec): 55.27 - samples/sec: 2381.55 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:44:16,493 epoch 9 - iter 1323/1476 - loss 0.00844392 - time (sec): 62.37 - samples/sec: 2374.19 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:44:23,796 epoch 9 - iter 1470/1476 - loss 0.00997719 - time (sec): 69.67 - samples/sec: 2376.32 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:44:24,083 ----------------------------------------------------------------------------------------------------
2023-10-13 16:44:24,083 EPOCH 9 done: loss 0.0100 - lr: 0.000006
2023-10-13 16:44:35,753 DEV : loss 0.25528380274772644 - f1-score (micro avg) 0.8021
2023-10-13 16:44:35,782 saving best model
2023-10-13 16:44:36,372 ----------------------------------------------------------------------------------------------------
2023-10-13 16:44:43,685 epoch 10 - iter 147/1476 - loss 0.00753215 - time (sec): 7.31 - samples/sec: 2427.92 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:44:50,427 epoch 10 - iter 294/1476 - loss 0.00733874 - time (sec): 14.05 - samples/sec: 2397.65 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:44:57,073 epoch 10 - iter 441/1476 - loss 0.00807781 - time (sec): 20.70 - samples/sec: 2379.64 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:45:04,012 epoch 10 - iter 588/1476 - loss 0.00716263 - time (sec): 27.63 - samples/sec: 2380.05 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:45:10,975 epoch 10 - iter 735/1476 - loss 0.00686967 - time (sec): 34.60 - samples/sec: 2365.22 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:45:18,281 epoch 10 - iter 882/1476 - loss 0.00663991 - time (sec): 41.90 - samples/sec: 2403.12 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:45:24,959 epoch 10 - iter 1029/1476 - loss 0.00626594 - time (sec): 48.58 - samples/sec: 2376.51 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:45:32,198 epoch 10 - iter 1176/1476 - loss 0.00672382 - time (sec): 55.82 - samples/sec: 2373.49 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:45:39,249 epoch 10 - iter 1323/1476 - loss 0.00733306 - time (sec): 62.87 - samples/sec: 2376.93 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:45:46,111 epoch 10 - iter 1470/1476 - loss 0.00668996 - time (sec): 69.73 - samples/sec: 2378.72 - lr: 0.000000 - momentum: 0.000000
2023-10-13 16:45:46,370 ----------------------------------------------------------------------------------------------------
2023-10-13 16:45:46,370 EPOCH 10 done: loss 0.0067 - lr: 0.000000
2023-10-13 16:45:57,519 DEV : loss 0.2591579258441925 - f1-score (micro avg) 0.8087
2023-10-13 16:45:57,551 saving best model
2023-10-13 16:45:58,550 ----------------------------------------------------------------------------------------------------
2023-10-13 16:45:58,552 Loading model from best epoch ...
2023-10-13 16:46:00,002 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
2023-10-13 16:46:05,936
Results:
- F-score (micro) 0.7887
- F-score (macro) 0.6932
- Accuracy 0.6774

By class:
              precision    recall  f1-score   support

         loc     0.8380    0.8800    0.8584       858
        pers     0.7456    0.7858    0.7652       537
         org     0.5489    0.5530    0.5509       132
        time     0.5373    0.6667    0.5950        54
        prod     0.7647    0.6393    0.6964        61

   micro avg     0.7712    0.8069    0.7887      1642
   macro avg     0.6869    0.7050    0.6932      1642
weighted avg     0.7719    0.8069    0.7885      1642

2023-10-13 16:46:05,936 ----------------------------------------------------------------------------------------------------