Upload folder using huggingface_hub
89c5c67
2023-10-13 16:32:27,478 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,479 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 16:32:27,479 ----------------------------------------------------------------------------------------------------
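The embedding backbone printed above is a 12-layer BERT-base model (hidden size 768, vocabulary 32001). As a sanity check, a rough parameter count can be derived from the printed module shapes alone; this is an illustrative sketch assuming the standard BERT parameter layout (weight plus bias for every Linear and LayerNorm), not Flair or transformers code:

```python
# Rough parameter count implied by the module shapes in the repr above
# (assumption: standard BERT layout, each Linear/LayerNorm has weight + bias).
hidden, vocab, max_pos, types, ff, layers = 768, 32001, 512, 2, 3072, 12

embeddings = (vocab * hidden          # word_embeddings
              + max_pos * hidden      # position_embeddings
              + types * hidden        # token_type_embeddings
              + 2 * hidden)           # embedding LayerNorm (weight + bias)

per_layer = (4 * (hidden * hidden + hidden)  # query/key/value + attention output dense
             + 2 * 2 * hidden                # two LayerNorms per layer
             + hidden * ff + ff              # intermediate dense (768 -> 3072)
             + ff * hidden + hidden)         # output dense (3072 -> 768)

pooler = hidden * hidden + hidden
total = embeddings + layers * per_layer + pooler  # roughly 110.6M parameters
```

The token-level classification head on top (`Linear(768, 21)`) adds only about 16k further parameters.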
2023-10-13 16:32:27,480 MultiCorpus: 5901 train + 1287 dev + 1505 test sentences
- NER_HIPE_2022 Corpus: 5901 train + 1287 dev + 1505 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/fr/with_doc_seperator
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 Train: 5901 sentences
2023-10-13 16:32:27,480 (train_with_dev=False, train_with_test=False)
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 Training Params:
2023-10-13 16:32:27,480 - learning_rate: "5e-05"
2023-10-13 16:32:27,480 - mini_batch_size: "4"
2023-10-13 16:32:27,480 - max_epochs: "10"
2023-10-13 16:32:27,480 - shuffle: "True"
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 Plugins:
2023-10-13 16:32:27,480 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
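A plain-Python sketch of what a linear schedule with `warmup_fraction: 0.1` does to the learning rate (the function `linear_schedule_lr` is illustrative, not a Flair API): the lr ramps from 0 to the peak 5e-05 over the first 10% of the 10 × 1476 = 14760 optimizer steps, then decays linearly back to 0, matching the `lr:` column in the epoch logs that follow.

```python
# Illustrative re-implementation of linear warmup + linear decay, as configured
# above (warmup_fraction 0.1, peak lr 5e-05). Not Flair's internal code; a
# sketch that reproduces the lr column of this training log.
def linear_schedule_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # warmup: 0 -> peak over the first 10% of steps
        return peak_lr * step / warmup_steps
    # decay: peak -> 0 over the remaining 90% of steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total_steps = 10 * 1476  # max_epochs x mini-batches per epoch (from this log)
```

With these numbers the warmup spans exactly epoch 1, which is why the logged lr climbs to 0.000050 by the end of epoch 1 and then shrinks toward 0.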
2023-10-13 16:32:27,480 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 16:32:27,480 - metric: "('micro avg', 'f1-score')"
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 Computation:
2023-10-13 16:32:27,480 - compute on device: cuda:0
2023-10-13 16:32:27,480 - embedding storage: none
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 Model training base path: "hmbench-hipe2020/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:27,480 ----------------------------------------------------------------------------------------------------
2023-10-13 16:32:34,656 epoch 1 - iter 147/1476 - loss 2.19264043 - time (sec): 7.17 - samples/sec: 2468.65 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:32:41,887 epoch 1 - iter 294/1476 - loss 1.34715324 - time (sec): 14.41 - samples/sec: 2494.63 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:32:48,820 epoch 1 - iter 441/1476 - loss 1.03252218 - time (sec): 21.34 - samples/sec: 2436.06 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:32:55,839 epoch 1 - iter 588/1476 - loss 0.84618778 - time (sec): 28.36 - samples/sec: 2423.52 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:33:03,052 epoch 1 - iter 735/1476 - loss 0.73617883 - time (sec): 35.57 - samples/sec: 2406.00 - lr: 0.000025 - momentum: 0.000000
2023-10-13 16:33:10,075 epoch 1 - iter 882/1476 - loss 0.65221352 - time (sec): 42.59 - samples/sec: 2401.55 - lr: 0.000030 - momentum: 0.000000
2023-10-13 16:33:16,986 epoch 1 - iter 1029/1476 - loss 0.59171549 - time (sec): 49.50 - samples/sec: 2395.74 - lr: 0.000035 - momentum: 0.000000
2023-10-13 16:33:23,748 epoch 1 - iter 1176/1476 - loss 0.54902197 - time (sec): 56.27 - samples/sec: 2370.02 - lr: 0.000040 - momentum: 0.000000
2023-10-13 16:33:30,731 epoch 1 - iter 1323/1476 - loss 0.50983066 - time (sec): 63.25 - samples/sec: 2366.30 - lr: 0.000045 - momentum: 0.000000
2023-10-13 16:33:37,571 epoch 1 - iter 1470/1476 - loss 0.47806023 - time (sec): 70.09 - samples/sec: 2366.46 - lr: 0.000050 - momentum: 0.000000
2023-10-13 16:33:37,841 ----------------------------------------------------------------------------------------------------
2023-10-13 16:33:37,841 EPOCH 1 done: loss 0.4771 - lr: 0.000050
2023-10-13 16:33:44,025 DEV : loss 0.14038856327533722 - f1-score (micro avg) 0.6914
2023-10-13 16:33:44,053 saving best model
2023-10-13 16:33:44,489 ----------------------------------------------------------------------------------------------------
2023-10-13 16:33:51,114 epoch 2 - iter 147/1476 - loss 0.13041987 - time (sec): 6.62 - samples/sec: 2298.32 - lr: 0.000049 - momentum: 0.000000
2023-10-13 16:33:57,959 epoch 2 - iter 294/1476 - loss 0.13920225 - time (sec): 13.47 - samples/sec: 2333.22 - lr: 0.000049 - momentum: 0.000000
2023-10-13 16:34:04,703 epoch 2 - iter 441/1476 - loss 0.13806486 - time (sec): 20.21 - samples/sec: 2363.50 - lr: 0.000048 - momentum: 0.000000
2023-10-13 16:34:11,574 epoch 2 - iter 588/1476 - loss 0.13521110 - time (sec): 27.08 - samples/sec: 2354.34 - lr: 0.000048 - momentum: 0.000000
2023-10-13 16:34:18,376 epoch 2 - iter 735/1476 - loss 0.14108629 - time (sec): 33.89 - samples/sec: 2331.03 - lr: 0.000047 - momentum: 0.000000
2023-10-13 16:34:25,372 epoch 2 - iter 882/1476 - loss 0.13917247 - time (sec): 40.88 - samples/sec: 2349.90 - lr: 0.000047 - momentum: 0.000000
2023-10-13 16:34:32,628 epoch 2 - iter 1029/1476 - loss 0.13722354 - time (sec): 48.14 - samples/sec: 2377.81 - lr: 0.000046 - momentum: 0.000000
2023-10-13 16:34:39,515 epoch 2 - iter 1176/1476 - loss 0.13237447 - time (sec): 55.02 - samples/sec: 2379.51 - lr: 0.000046 - momentum: 0.000000
2023-10-13 16:34:46,566 epoch 2 - iter 1323/1476 - loss 0.13528695 - time (sec): 62.08 - samples/sec: 2384.90 - lr: 0.000045 - momentum: 0.000000
2023-10-13 16:34:53,726 epoch 2 - iter 1470/1476 - loss 0.13631749 - time (sec): 69.24 - samples/sec: 2391.96 - lr: 0.000044 - momentum: 0.000000
2023-10-13 16:34:54,002 ----------------------------------------------------------------------------------------------------
2023-10-13 16:34:54,002 EPOCH 2 done: loss 0.1364 - lr: 0.000044
2023-10-13 16:35:05,260 DEV : loss 0.14942534267902374 - f1-score (micro avg) 0.7366
2023-10-13 16:35:05,289 saving best model
2023-10-13 16:35:05,779 ----------------------------------------------------------------------------------------------------
2023-10-13 16:35:12,635 epoch 3 - iter 147/1476 - loss 0.07591353 - time (sec): 6.85 - samples/sec: 2265.56 - lr: 0.000044 - momentum: 0.000000
2023-10-13 16:35:19,528 epoch 3 - iter 294/1476 - loss 0.09227817 - time (sec): 13.74 - samples/sec: 2354.13 - lr: 0.000043 - momentum: 0.000000
2023-10-13 16:35:26,354 epoch 3 - iter 441/1476 - loss 0.09626614 - time (sec): 20.57 - samples/sec: 2363.80 - lr: 0.000043 - momentum: 0.000000
2023-10-13 16:35:33,095 epoch 3 - iter 588/1476 - loss 0.09646343 - time (sec): 27.31 - samples/sec: 2347.82 - lr: 0.000042 - momentum: 0.000000
2023-10-13 16:35:40,240 epoch 3 - iter 735/1476 - loss 0.09416145 - time (sec): 34.45 - samples/sec: 2367.35 - lr: 0.000042 - momentum: 0.000000
2023-10-13 16:35:47,371 epoch 3 - iter 882/1476 - loss 0.09456799 - time (sec): 41.59 - samples/sec: 2406.31 - lr: 0.000041 - momentum: 0.000000
2023-10-13 16:35:54,299 epoch 3 - iter 1029/1476 - loss 0.09149587 - time (sec): 48.51 - samples/sec: 2393.68 - lr: 0.000041 - momentum: 0.000000
2023-10-13 16:36:01,363 epoch 3 - iter 1176/1476 - loss 0.09190785 - time (sec): 55.58 - samples/sec: 2410.90 - lr: 0.000040 - momentum: 0.000000
2023-10-13 16:36:08,187 epoch 3 - iter 1323/1476 - loss 0.09115192 - time (sec): 62.40 - samples/sec: 2407.82 - lr: 0.000039 - momentum: 0.000000
2023-10-13 16:36:15,108 epoch 3 - iter 1470/1476 - loss 0.09131695 - time (sec): 69.32 - samples/sec: 2391.98 - lr: 0.000039 - momentum: 0.000000
2023-10-13 16:36:15,376 ----------------------------------------------------------------------------------------------------
2023-10-13 16:36:15,376 EPOCH 3 done: loss 0.0912 - lr: 0.000039
2023-10-13 16:36:26,447 DEV : loss 0.17109665274620056 - f1-score (micro avg) 0.7761
2023-10-13 16:36:26,476 saving best model
2023-10-13 16:36:27,050 ----------------------------------------------------------------------------------------------------
2023-10-13 16:36:33,918 epoch 4 - iter 147/1476 - loss 0.06448851 - time (sec): 6.86 - samples/sec: 2220.09 - lr: 0.000038 - momentum: 0.000000
2023-10-13 16:36:40,610 epoch 4 - iter 294/1476 - loss 0.05935983 - time (sec): 13.56 - samples/sec: 2294.42 - lr: 0.000038 - momentum: 0.000000
2023-10-13 16:36:47,491 epoch 4 - iter 441/1476 - loss 0.06577537 - time (sec): 20.44 - samples/sec: 2339.67 - lr: 0.000037 - momentum: 0.000000
2023-10-13 16:36:54,183 epoch 4 - iter 588/1476 - loss 0.06370900 - time (sec): 27.13 - samples/sec: 2330.56 - lr: 0.000037 - momentum: 0.000000
2023-10-13 16:37:01,145 epoch 4 - iter 735/1476 - loss 0.06430938 - time (sec): 34.09 - samples/sec: 2344.21 - lr: 0.000036 - momentum: 0.000000
2023-10-13 16:37:08,310 epoch 4 - iter 882/1476 - loss 0.06262401 - time (sec): 41.26 - samples/sec: 2361.23 - lr: 0.000036 - momentum: 0.000000
2023-10-13 16:37:15,650 epoch 4 - iter 1029/1476 - loss 0.06226410 - time (sec): 48.60 - samples/sec: 2394.58 - lr: 0.000035 - momentum: 0.000000
2023-10-13 16:37:22,567 epoch 4 - iter 1176/1476 - loss 0.06322309 - time (sec): 55.51 - samples/sec: 2395.17 - lr: 0.000034 - momentum: 0.000000
2023-10-13 16:37:29,580 epoch 4 - iter 1323/1476 - loss 0.06623547 - time (sec): 62.53 - samples/sec: 2394.70 - lr: 0.000034 - momentum: 0.000000
2023-10-13 16:37:36,240 epoch 4 - iter 1470/1476 - loss 0.06439991 - time (sec): 69.18 - samples/sec: 2395.39 - lr: 0.000033 - momentum: 0.000000
2023-10-13 16:37:36,522 ----------------------------------------------------------------------------------------------------
2023-10-13 16:37:36,522 EPOCH 4 done: loss 0.0647 - lr: 0.000033
2023-10-13 16:37:47,651 DEV : loss 0.22011879086494446 - f1-score (micro avg) 0.7734
2023-10-13 16:37:47,680 ----------------------------------------------------------------------------------------------------
2023-10-13 16:37:54,663 epoch 5 - iter 147/1476 - loss 0.05406225 - time (sec): 6.98 - samples/sec: 2371.22 - lr: 0.000033 - momentum: 0.000000
2023-10-13 16:38:02,070 epoch 5 - iter 294/1476 - loss 0.05659229 - time (sec): 14.39 - samples/sec: 2313.32 - lr: 0.000032 - momentum: 0.000000
2023-10-13 16:38:09,156 epoch 5 - iter 441/1476 - loss 0.05185982 - time (sec): 21.48 - samples/sec: 2344.21 - lr: 0.000032 - momentum: 0.000000
2023-10-13 16:38:15,969 epoch 5 - iter 588/1476 - loss 0.05025944 - time (sec): 28.29 - samples/sec: 2347.40 - lr: 0.000031 - momentum: 0.000000
2023-10-13 16:38:22,835 epoch 5 - iter 735/1476 - loss 0.04794011 - time (sec): 35.15 - samples/sec: 2348.11 - lr: 0.000031 - momentum: 0.000000
2023-10-13 16:38:29,650 epoch 5 - iter 882/1476 - loss 0.04879169 - time (sec): 41.97 - samples/sec: 2336.67 - lr: 0.000030 - momentum: 0.000000
2023-10-13 16:38:36,556 epoch 5 - iter 1029/1476 - loss 0.04784084 - time (sec): 48.88 - samples/sec: 2340.55 - lr: 0.000029 - momentum: 0.000000
2023-10-13 16:38:43,817 epoch 5 - iter 1176/1476 - loss 0.04785536 - time (sec): 56.14 - samples/sec: 2366.35 - lr: 0.000029 - momentum: 0.000000
2023-10-13 16:38:50,949 epoch 5 - iter 1323/1476 - loss 0.04673364 - time (sec): 63.27 - samples/sec: 2382.16 - lr: 0.000028 - momentum: 0.000000
2023-10-13 16:38:57,668 epoch 5 - iter 1470/1476 - loss 0.04668226 - time (sec): 69.99 - samples/sec: 2369.74 - lr: 0.000028 - momentum: 0.000000
2023-10-13 16:38:57,926 ----------------------------------------------------------------------------------------------------
2023-10-13 16:38:57,927 EPOCH 5 done: loss 0.0466 - lr: 0.000028
2023-10-13 16:39:09,057 DEV : loss 0.18591712415218353 - f1-score (micro avg) 0.7918
2023-10-13 16:39:09,086 saving best model
2023-10-13 16:39:09,678 ----------------------------------------------------------------------------------------------------
2023-10-13 16:39:16,341 epoch 6 - iter 147/1476 - loss 0.02523404 - time (sec): 6.66 - samples/sec: 2241.56 - lr: 0.000027 - momentum: 0.000000
2023-10-13 16:39:23,617 epoch 6 - iter 294/1476 - loss 0.02890210 - time (sec): 13.94 - samples/sec: 2449.63 - lr: 0.000027 - momentum: 0.000000
2023-10-13 16:39:30,543 epoch 6 - iter 441/1476 - loss 0.03369659 - time (sec): 20.86 - samples/sec: 2447.46 - lr: 0.000026 - momentum: 0.000000
2023-10-13 16:39:37,495 epoch 6 - iter 588/1476 - loss 0.03450237 - time (sec): 27.81 - samples/sec: 2424.06 - lr: 0.000026 - momentum: 0.000000
2023-10-13 16:39:44,573 epoch 6 - iter 735/1476 - loss 0.03447603 - time (sec): 34.89 - samples/sec: 2434.04 - lr: 0.000025 - momentum: 0.000000
2023-10-13 16:39:51,749 epoch 6 - iter 882/1476 - loss 0.03728195 - time (sec): 42.07 - samples/sec: 2428.75 - lr: 0.000024 - momentum: 0.000000
2023-10-13 16:39:58,407 epoch 6 - iter 1029/1476 - loss 0.03609371 - time (sec): 48.73 - samples/sec: 2409.30 - lr: 0.000024 - momentum: 0.000000
2023-10-13 16:40:05,348 epoch 6 - iter 1176/1476 - loss 0.03396739 - time (sec): 55.67 - samples/sec: 2406.36 - lr: 0.000023 - momentum: 0.000000
2023-10-13 16:40:12,191 epoch 6 - iter 1323/1476 - loss 0.03505791 - time (sec): 62.51 - samples/sec: 2396.41 - lr: 0.000023 - momentum: 0.000000
2023-10-13 16:40:19,135 epoch 6 - iter 1470/1476 - loss 0.03535902 - time (sec): 69.45 - samples/sec: 2389.64 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:40:19,401 ----------------------------------------------------------------------------------------------------
2023-10-13 16:40:19,401 EPOCH 6 done: loss 0.0353 - lr: 0.000022
2023-10-13 16:40:30,593 DEV : loss 0.20590120553970337 - f1-score (micro avg) 0.7935
2023-10-13 16:40:30,622 saving best model
2023-10-13 16:40:31,204 ----------------------------------------------------------------------------------------------------
2023-10-13 16:40:38,565 epoch 7 - iter 147/1476 - loss 0.01875450 - time (sec): 7.36 - samples/sec: 2324.91 - lr: 0.000022 - momentum: 0.000000
2023-10-13 16:40:45,309 epoch 7 - iter 294/1476 - loss 0.01827635 - time (sec): 14.10 - samples/sec: 2279.02 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:40:52,625 epoch 7 - iter 441/1476 - loss 0.02228262 - time (sec): 21.42 - samples/sec: 2360.36 - lr: 0.000021 - momentum: 0.000000
2023-10-13 16:41:00,112 epoch 7 - iter 588/1476 - loss 0.02120762 - time (sec): 28.91 - samples/sec: 2398.25 - lr: 0.000020 - momentum: 0.000000
2023-10-13 16:41:06,805 epoch 7 - iter 735/1476 - loss 0.02126680 - time (sec): 35.60 - samples/sec: 2370.88 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:41:13,433 epoch 7 - iter 882/1476 - loss 0.02140474 - time (sec): 42.23 - samples/sec: 2362.59 - lr: 0.000019 - momentum: 0.000000
2023-10-13 16:41:19,855 epoch 7 - iter 1029/1476 - loss 0.02223445 - time (sec): 48.65 - samples/sec: 2381.63 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:41:26,472 epoch 7 - iter 1176/1476 - loss 0.02219911 - time (sec): 55.26 - samples/sec: 2380.18 - lr: 0.000018 - momentum: 0.000000
2023-10-13 16:41:33,320 epoch 7 - iter 1323/1476 - loss 0.02216569 - time (sec): 62.11 - samples/sec: 2376.25 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:41:40,533 epoch 7 - iter 1470/1476 - loss 0.02249640 - time (sec): 69.33 - samples/sec: 2392.97 - lr: 0.000017 - momentum: 0.000000
2023-10-13 16:41:40,788 ----------------------------------------------------------------------------------------------------
2023-10-13 16:41:40,788 EPOCH 7 done: loss 0.0226 - lr: 0.000017
2023-10-13 16:41:51,974 DEV : loss 0.23674637079238892 - f1-score (micro avg) 0.7971
2023-10-13 16:41:52,003 saving best model
2023-10-13 16:41:52,500 ----------------------------------------------------------------------------------------------------
2023-10-13 16:41:59,567 epoch 8 - iter 147/1476 - loss 0.01813003 - time (sec): 7.07 - samples/sec: 2377.41 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:42:06,585 epoch 8 - iter 294/1476 - loss 0.01758301 - time (sec): 14.08 - samples/sec: 2397.59 - lr: 0.000016 - momentum: 0.000000
2023-10-13 16:42:13,868 epoch 8 - iter 441/1476 - loss 0.02116504 - time (sec): 21.37 - samples/sec: 2488.85 - lr: 0.000015 - momentum: 0.000000
2023-10-13 16:42:20,522 epoch 8 - iter 588/1476 - loss 0.02109938 - time (sec): 28.02 - samples/sec: 2426.58 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:42:27,485 epoch 8 - iter 735/1476 - loss 0.01950654 - time (sec): 34.98 - samples/sec: 2394.96 - lr: 0.000014 - momentum: 0.000000
2023-10-13 16:42:34,305 epoch 8 - iter 882/1476 - loss 0.01842251 - time (sec): 41.80 - samples/sec: 2372.96 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:42:41,420 epoch 8 - iter 1029/1476 - loss 0.01674893 - time (sec): 48.92 - samples/sec: 2354.63 - lr: 0.000013 - momentum: 0.000000
2023-10-13 16:42:48,112 epoch 8 - iter 1176/1476 - loss 0.01663061 - time (sec): 55.61 - samples/sec: 2343.53 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:42:55,261 epoch 8 - iter 1323/1476 - loss 0.01632528 - time (sec): 62.76 - samples/sec: 2359.37 - lr: 0.000012 - momentum: 0.000000
2023-10-13 16:43:02,148 epoch 8 - iter 1470/1476 - loss 0.01581879 - time (sec): 69.65 - samples/sec: 2380.38 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:43:02,409 ----------------------------------------------------------------------------------------------------
2023-10-13 16:43:02,409 EPOCH 8 done: loss 0.0158 - lr: 0.000011
2023-10-13 16:43:13,521 DEV : loss 0.25357791781425476 - f1-score (micro avg) 0.7984
2023-10-13 16:43:13,550 saving best model
2023-10-13 16:43:14,117 ----------------------------------------------------------------------------------------------------
2023-10-13 16:43:21,014 epoch 9 - iter 147/1476 - loss 0.00927471 - time (sec): 6.89 - samples/sec: 2259.68 - lr: 0.000011 - momentum: 0.000000
2023-10-13 16:43:28,062 epoch 9 - iter 294/1476 - loss 0.00809657 - time (sec): 13.94 - samples/sec: 2364.85 - lr: 0.000010 - momentum: 0.000000
2023-10-13 16:43:34,763 epoch 9 - iter 441/1476 - loss 0.00782182 - time (sec): 20.64 - samples/sec: 2353.66 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:43:41,562 epoch 9 - iter 588/1476 - loss 0.00861677 - time (sec): 27.44 - samples/sec: 2378.37 - lr: 0.000009 - momentum: 0.000000
2023-10-13 16:43:48,725 epoch 9 - iter 735/1476 - loss 0.00955088 - time (sec): 34.60 - samples/sec: 2408.79 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:43:55,520 epoch 9 - iter 882/1476 - loss 0.00855623 - time (sec): 41.40 - samples/sec: 2391.02 - lr: 0.000008 - momentum: 0.000000
2023-10-13 16:44:02,540 epoch 9 - iter 1029/1476 - loss 0.00809094 - time (sec): 48.42 - samples/sec: 2397.34 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:44:09,395 epoch 9 - iter 1176/1476 - loss 0.00852934 - time (sec): 55.27 - samples/sec: 2381.55 - lr: 0.000007 - momentum: 0.000000
2023-10-13 16:44:16,493 epoch 9 - iter 1323/1476 - loss 0.00844392 - time (sec): 62.37 - samples/sec: 2374.19 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:44:23,796 epoch 9 - iter 1470/1476 - loss 0.00997719 - time (sec): 69.67 - samples/sec: 2376.32 - lr: 0.000006 - momentum: 0.000000
2023-10-13 16:44:24,083 ----------------------------------------------------------------------------------------------------
2023-10-13 16:44:24,083 EPOCH 9 done: loss 0.0100 - lr: 0.000006
2023-10-13 16:44:35,753 DEV : loss 0.25528380274772644 - f1-score (micro avg) 0.8021
2023-10-13 16:44:35,782 saving best model
2023-10-13 16:44:36,372 ----------------------------------------------------------------------------------------------------
2023-10-13 16:44:43,685 epoch 10 - iter 147/1476 - loss 0.00753215 - time (sec): 7.31 - samples/sec: 2427.92 - lr: 0.000005 - momentum: 0.000000
2023-10-13 16:44:50,427 epoch 10 - iter 294/1476 - loss 0.00733874 - time (sec): 14.05 - samples/sec: 2397.65 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:44:57,073 epoch 10 - iter 441/1476 - loss 0.00807781 - time (sec): 20.70 - samples/sec: 2379.64 - lr: 0.000004 - momentum: 0.000000
2023-10-13 16:45:04,012 epoch 10 - iter 588/1476 - loss 0.00716263 - time (sec): 27.63 - samples/sec: 2380.05 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:45:10,975 epoch 10 - iter 735/1476 - loss 0.00686967 - time (sec): 34.60 - samples/sec: 2365.22 - lr: 0.000003 - momentum: 0.000000
2023-10-13 16:45:18,281 epoch 10 - iter 882/1476 - loss 0.00663991 - time (sec): 41.90 - samples/sec: 2403.12 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:45:24,959 epoch 10 - iter 1029/1476 - loss 0.00626594 - time (sec): 48.58 - samples/sec: 2376.51 - lr: 0.000002 - momentum: 0.000000
2023-10-13 16:45:32,198 epoch 10 - iter 1176/1476 - loss 0.00672382 - time (sec): 55.82 - samples/sec: 2373.49 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:45:39,249 epoch 10 - iter 1323/1476 - loss 0.00733306 - time (sec): 62.87 - samples/sec: 2376.93 - lr: 0.000001 - momentum: 0.000000
2023-10-13 16:45:46,111 epoch 10 - iter 1470/1476 - loss 0.00668996 - time (sec): 69.73 - samples/sec: 2378.72 - lr: 0.000000 - momentum: 0.000000
2023-10-13 16:45:46,370 ----------------------------------------------------------------------------------------------------
2023-10-13 16:45:46,370 EPOCH 10 done: loss 0.0067 - lr: 0.000000
2023-10-13 16:45:57,519 DEV : loss 0.2591579258441925 - f1-score (micro avg) 0.8087
2023-10-13 16:45:57,551 saving best model
2023-10-13 16:45:58,550 ----------------------------------------------------------------------------------------------------
2023-10-13 16:45:58,552 Loading model from best epoch ...
2023-10-13 16:46:00,002 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-time, B-time, E-time, I-time, S-prod, B-prod, E-prod, I-prod
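The 21-tag dictionary above uses the BIOES labeling scheme (S = single-token entity, B/I/E = begin/inside/end of a multi-token entity, O = outside). A minimal, hypothetical decoder (not Flair's internal one) showing how such a tag sequence maps back to entity spans:

```python
# Minimal BIOES decoder: turns a per-token tag sequence into
# (start_index, end_index, label) spans. Illustrative only; it does not
# validate label consistency within a span the way a full decoder would.
def bioes_to_spans(tags):
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":                      # single-token entity
            spans.append((i, i, label))
        elif prefix == "B":                    # begin multi-token entity
            start = i
        elif prefix == "E" and start is not None:  # end multi-token entity
            spans.append((start, i, label))
            start = None
        # "I" continues an open span; "O" is outside any entity
    return spans
```

For example, `["B-loc", "I-loc", "E-loc", "O", "S-pers"]` decodes to one three-token `loc` span and one single-token `pers` span.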
2023-10-13 16:46:05,936
Results:
- F-score (micro) 0.7887
- F-score (macro) 0.6932
- Accuracy 0.6774
By class:
              precision    recall  f1-score   support

         loc     0.8380    0.8800    0.8584       858
        pers     0.7456    0.7858    0.7652       537
         org     0.5489    0.5530    0.5509       132
        time     0.5373    0.6667    0.5950        54
        prod     0.7647    0.6393    0.6964        61

   micro avg     0.7712    0.8069    0.7887      1642
   macro avg     0.6869    0.7050    0.6932      1642
weighted avg     0.7719    0.8069    0.7885      1642
2023-10-13 16:46:05,936 ----------------------------------------------------------------------------------------------------
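For reference, the summary rows of the test table relate to the per-class scores as follows. The sketch below recomputes them from the rounded values printed above, so it agrees with the log only up to rounding:

```python
# How the averaged rows are derived from the per-class rows (values copied
# from the table above; results match the log up to input rounding).
f1 = {"loc": 0.8584, "pers": 0.7652, "org": 0.5509, "time": 0.5950, "prod": 0.6964}
support = {"loc": 858, "pers": 537, "org": 132, "time": 54, "prod": 61}

macro_f1 = sum(f1.values()) / len(f1)     # unweighted mean over classes, ~0.6932

p, r = 0.7712, 0.8069                     # micro-averaged precision / recall
micro_f1 = 2 * p * r / (p + r)            # harmonic mean, ~0.7887

weighted_f1 = (sum(f1[c] * support[c] for c in f1)
               / sum(support.values()))   # support-weighted mean, ~0.7885
```

Micro averaging pools all 1642 gold entities before computing F1, so the frequent `loc` and `pers` classes dominate it; macro averaging weights all five classes equally, which is why the weaker `org` and `time` scores pull it down to 0.6932.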