2023-10-14 23:34:14,669 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,669 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
 - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
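As a sanity check, the layer shapes in the model printout above are enough to tally the tagger's parameter count. A minimal sketch (the helper names `linear` and `layer_norm` are illustrative, not part of Flair or PyTorch):

```python
# Tally parameters from the module shapes printed in the log above.
# Linear(i, o) with bias has i*o + o params; Embedding(n, d) has n*d;
# LayerNorm((d,)) has 2*d (weight + bias).

def linear(i, o):
    return i * o + o

def layer_norm(d):
    return 2 * d

# BertEmbeddings: word, position, and token-type embeddings + LayerNorm
embeddings = 32001 * 768 + 512 * 768 + 2 * 768 + layer_norm(768)

# One BertLayer: Q/K/V, attention output (+ LN), intermediate, output (+ LN)
per_layer = (
    3 * linear(768, 768)                    # query, key, value
    + linear(768, 768) + layer_norm(768)    # BertSelfOutput
    + linear(768, 3072)                     # BertIntermediate
    + linear(3072, 768) + layer_norm(768)   # BertOutput
)

pooler = linear(768, 768)
head = linear(768, 13)  # token-classification head over the 13-tag dictionary

total = embeddings + 12 * per_layer + pooler + head
print(total)  # 110628109 (~110.6M parameters)
```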
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 Train:  14465 sentences
2023-10-14 23:34:14,670         (train_with_dev=False, train_with_test=False)
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 Training Params:
2023-10-14 23:34:14,670  - learning_rate: "3e-05"
2023-10-14 23:34:14,670  - mini_batch_size: "4"
2023-10-14 23:34:14,670  - max_epochs: "10"
2023-10-14 23:34:14,670  - shuffle: "True"
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 Plugins:
2023-10-14 23:34:14,670  - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 23:34:14,670  - metric: "('micro avg', 'f1-score')"
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 Computation:
2023-10-14 23:34:14,670  - compute on device: cuda:0
2023-10-14 23:34:14,670  - embedding storage: none
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:31,234 epoch 1 - iter 361/3617 - loss 1.36163379 - time (sec): 16.56 - samples/sec: 2286.03 - lr: 0.000003 - momentum: 0.000000
2023-10-14 23:34:47,495 epoch 1 - iter 722/3617 - loss 0.78535193 - time (sec): 32.82 - samples/sec: 2288.96 - lr: 0.000006 - momentum: 0.000000
2023-10-14 23:35:06,592 epoch 1 - iter 1083/3617 - loss 0.57914316 - time (sec): 51.92 - samples/sec: 2183.91 - lr: 0.000009 - momentum: 0.000000
2023-10-14 23:35:23,708 epoch 1 - iter 1444/3617 - loss 0.47133260 - time (sec): 69.04 - samples/sec: 2183.38 - lr: 0.000012 - momentum: 0.000000
2023-10-14 23:35:40,204 epoch 1 - iter 1805/3617 - loss 0.40334999 - time (sec): 85.53 - samples/sec: 2212.25 - lr: 0.000015 - momentum: 0.000000
2023-10-14 23:35:56,791 epoch 1 - iter 2166/3617 - loss 0.35690213 - time (sec): 102.12 - samples/sec: 2222.98 - lr: 0.000018 - momentum: 0.000000
2023-10-14 23:36:13,063 epoch 1 - iter 2527/3617 - loss 0.32256725 - time (sec): 118.39 - samples/sec: 2230.09 - lr: 0.000021 - momentum: 0.000000
2023-10-14 23:36:31,390 epoch 1 - iter 2888/3617 - loss 0.29552514 - time (sec): 136.72 - samples/sec: 2208.50 - lr: 0.000024 - momentum: 0.000000
2023-10-14 23:36:48,635 epoch 1 - iter 3249/3617 - loss 0.27457449 - time (sec): 153.96 - samples/sec: 2214.58 - lr: 0.000027 - momentum: 0.000000
2023-10-14 23:37:07,987 epoch 1 - iter 3610/3617 - loss 0.25799655 - time (sec): 173.32 - samples/sec: 2188.26 - lr: 0.000030 - momentum: 0.000000
2023-10-14 23:37:08,344 ----------------------------------------------------------------------------------------------------
2023-10-14 23:37:08,344 EPOCH 1 done: loss 0.2577 - lr: 0.000030
2023-10-14 23:37:13,548 DEV : loss 0.1281796246767044 - f1-score (micro avg) 0.5899
2023-10-14 23:37:13,589 saving best model
2023-10-14 23:37:14,042 ----------------------------------------------------------------------------------------------------
2023-10-14 23:37:31,572 epoch 2 - iter 361/3617 - loss 0.10519861 - time (sec): 17.53 - samples/sec: 2217.47 - lr: 0.000030 - momentum: 0.000000
2023-10-14 23:37:50,569 epoch 2 - iter 722/3617 - loss 0.10049883 - time (sec): 36.53 - samples/sec: 2096.09 - lr: 0.000029 - momentum: 0.000000
2023-10-14 23:38:07,284 epoch 2 - iter 1083/3617 - loss 0.10110167 - time (sec): 53.24 - samples/sec: 2145.24 - lr: 0.000029 - momentum: 0.000000
2023-10-14 23:38:23,594 epoch 2 - iter 1444/3617 - loss 0.09863041 - time (sec): 69.55 - samples/sec: 2200.65 - lr: 0.000029 - momentum: 0.000000
2023-10-14 23:38:41,303 epoch 2 - iter 1805/3617 - loss 0.10015583 - time (sec): 87.26 - samples/sec: 2188.65 - lr: 0.000028 - momentum: 0.000000
2023-10-14 23:38:59,829 epoch 2 - iter 2166/3617 - loss 0.09892596 - time (sec): 105.79 - samples/sec: 2178.56 - lr: 0.000028 - momentum: 0.000000
2023-10-14 23:39:16,761 epoch 2 - iter 2527/3617 - loss 0.10000026 - time (sec): 122.72 - samples/sec: 2180.19 - lr: 0.000028 - momentum: 0.000000
2023-10-14 23:39:34,633 epoch 2 - iter 2888/3617 - loss 0.09831535 - time (sec): 140.59 - samples/sec: 2164.63 - lr: 0.000027 - momentum: 0.000000
2023-10-14 23:39:51,700 epoch 2 - iter 3249/3617 - loss 0.09854326 - time (sec): 157.66 - samples/sec: 2170.25 - lr: 0.000027 - momentum: 0.000000
2023-10-14 23:40:11,440 epoch 2 - iter 3610/3617 - loss 0.09811914 - time (sec): 177.40 - samples/sec: 2138.64 - lr: 0.000027 - momentum: 0.000000
2023-10-14 23:40:11,809 ----------------------------------------------------------------------------------------------------
2023-10-14 23:40:11,809 EPOCH 2 done: loss 0.0981 - lr: 0.000027
2023-10-14 23:40:18,793 DEV : loss 0.1390405148267746 - f1-score (micro avg) 0.6624
2023-10-14 23:40:18,825 saving best model
2023-10-14 23:40:19,506 ----------------------------------------------------------------------------------------------------
2023-10-14 23:40:36,393 epoch 3 - iter 361/3617 - loss 0.06299067 - time (sec): 16.88 - samples/sec: 2243.14 - lr: 0.000026 - momentum: 0.000000
2023-10-14 23:40:52,924 epoch 3 - iter 722/3617 - loss 0.06908990 - time (sec): 33.42 - samples/sec: 2302.91 - lr: 0.000026 - momentum: 0.000000
2023-10-14 23:41:09,158 epoch 3 - iter 1083/3617 - loss 0.06892332 - time (sec): 49.65 - samples/sec: 2289.47 - lr: 0.000026 - momentum: 0.000000
2023-10-14 23:41:25,608 epoch 3 - iter 1444/3617 - loss 0.07129677 - time (sec): 66.10 - samples/sec: 2289.03 - lr: 0.000025 - momentum: 0.000000
2023-10-14 23:41:41,972 epoch 3 - iter 1805/3617 - loss 0.07081905 - time (sec): 82.46 - samples/sec: 2275.35 - lr: 0.000025 - momentum: 0.000000
2023-10-14 23:41:58,328 epoch 3 - iter 2166/3617 - loss 0.07078356 - time (sec): 98.82 - samples/sec: 2286.31 - lr: 0.000025 - momentum: 0.000000
2023-10-14 23:42:14,621 epoch 3 - iter 2527/3617 - loss 0.07071235 - time (sec): 115.11 - samples/sec: 2294.00 - lr: 0.000024 - momentum: 0.000000
2023-10-14 23:42:30,794 epoch 3 - iter 2888/3617 - loss 0.07182890 - time (sec): 131.28 - samples/sec: 2301.27 - lr: 0.000024 - momentum: 0.000000
2023-10-14 23:42:47,165 epoch 3 - iter 3249/3617 - loss 0.07170442 - time (sec): 147.66 - samples/sec: 2299.57 - lr: 0.000024 - momentum: 0.000000
2023-10-14 23:43:03,880 epoch 3 - iter 3610/3617 - loss 0.07109717 - time (sec): 164.37 - samples/sec: 2308.22 - lr: 0.000023 - momentum: 0.000000
2023-10-14 23:43:04,189 ----------------------------------------------------------------------------------------------------
2023-10-14 23:43:04,189 EPOCH 3 done: loss 0.0713 - lr: 0.000023
2023-10-14 23:43:10,654 DEV : loss 0.19752231240272522 - f1-score (micro avg) 0.6235
2023-10-14 23:43:10,698 ----------------------------------------------------------------------------------------------------
2023-10-14 23:43:27,930 epoch 4 - iter 361/3617 - loss 0.04436168 - time (sec): 17.23 - samples/sec: 2202.46 - lr: 0.000023 - momentum: 0.000000
2023-10-14 23:43:44,435 epoch 4 - iter 722/3617 - loss 0.04585262 - time (sec): 33.73 - samples/sec: 2218.87 - lr: 0.000023 - momentum: 0.000000
2023-10-14 23:44:01,096 epoch 4 - iter 1083/3617 - loss 0.04921851 - time (sec): 50.40 - samples/sec: 2279.70 - lr: 0.000022 - momentum: 0.000000
2023-10-14 23:44:17,549 epoch 4 - iter 1444/3617 - loss 0.04934984 - time (sec): 66.85 - samples/sec: 2280.98 - lr: 0.000022 - momentum: 0.000000
2023-10-14 23:44:34,110 epoch 4 - iter 1805/3617 - loss 0.05175654 - time (sec): 83.41 - samples/sec: 2279.99 - lr: 0.000022 - momentum: 0.000000
2023-10-14 23:44:50,541 epoch 4 - iter 2166/3617 - loss 0.05020951 - time (sec): 99.84 - samples/sec: 2282.77 - lr: 0.000021 - momentum: 0.000000
2023-10-14 23:45:07,182 epoch 4 - iter 2527/3617 - loss 0.05084194 - time (sec): 116.48 - samples/sec: 2283.21 - lr: 0.000021 - momentum: 0.000000
2023-10-14 23:45:23,435 epoch 4 - iter 2888/3617 - loss 0.05193754 - time (sec): 132.74 - samples/sec: 2285.41 - lr: 0.000021 - momentum: 0.000000
2023-10-14 23:45:39,602 epoch 4 - iter 3249/3617 - loss 0.05165414 - time (sec): 148.90 - samples/sec: 2291.14 - lr: 0.000020 - momentum: 0.000000
2023-10-14 23:45:55,738 epoch 4 - iter 3610/3617 - loss 0.05169628 - time (sec): 165.04 - samples/sec: 2298.21 - lr: 0.000020 - momentum: 0.000000
2023-10-14 23:45:56,041 ----------------------------------------------------------------------------------------------------
2023-10-14 23:45:56,041 EPOCH 4 done: loss 0.0517 - lr: 0.000020
2023-10-14 23:46:02,662 DEV : loss 0.2679402828216553 - f1-score (micro avg) 0.6366
2023-10-14 23:46:02,698 ----------------------------------------------------------------------------------------------------
2023-10-14 23:46:19,607 epoch 5 - iter 361/3617 - loss 0.04298391 - time (sec): 16.91 - samples/sec: 2203.83 - lr: 0.000020 - momentum: 0.000000
2023-10-14 23:46:36,400 epoch 5 - iter 722/3617 - loss 0.03826667 - time (sec): 33.70 - samples/sec: 2257.95 - lr: 0.000019 - momentum: 0.000000
2023-10-14 23:46:52,804 epoch 5 - iter 1083/3617 - loss 0.03941655 - time (sec): 50.10 - samples/sec: 2289.34 - lr: 0.000019 - momentum: 0.000000
2023-10-14 23:47:09,226 epoch 5 - iter 1444/3617 - loss 0.03859460 - time (sec): 66.53 - samples/sec: 2288.55 - lr: 0.000019 - momentum: 0.000000
2023-10-14 23:47:25,683 epoch 5 - iter 1805/3617 - loss 0.03843288 - time (sec): 82.98 - samples/sec: 2296.59 - lr: 0.000018 - momentum: 0.000000
2023-10-14 23:47:41,999 epoch 5 - iter 2166/3617 - loss 0.03940779 - time (sec): 99.30 - samples/sec: 2306.54 - lr: 0.000018 - momentum: 0.000000
2023-10-14 23:47:58,428 epoch 5 - iter 2527/3617 - loss 0.03985879 - time (sec): 115.73 - samples/sec: 2320.92 - lr: 0.000018 - momentum: 0.000000
2023-10-14 23:48:14,698 epoch 5 - iter 2888/3617 - loss 0.03859164 - time (sec): 132.00 - samples/sec: 2318.43 - lr: 0.000017 - momentum: 0.000000
2023-10-14 23:48:31,109 epoch 5 - iter 3249/3617 - loss 0.03871626 - time (sec): 148.41 - samples/sec: 2309.97 - lr: 0.000017 - momentum: 0.000000
2023-10-14 23:48:47,292 epoch 5 - iter 3610/3617 - loss 0.03883470 - time (sec): 164.59 - samples/sec: 2304.32 - lr: 0.000017 - momentum: 0.000000
2023-10-14 23:48:47,598 ----------------------------------------------------------------------------------------------------
2023-10-14 23:48:47,598 EPOCH 5 done: loss 0.0388 - lr: 0.000017
2023-10-14 23:48:53,945 DEV : loss 0.2992906868457794 - f1-score (micro avg) 0.6196
2023-10-14 23:48:53,975 ----------------------------------------------------------------------------------------------------
2023-10-14 23:49:10,350 epoch 6 - iter 361/3617 - loss 0.02858051 - time (sec): 16.37 - samples/sec: 2293.30 - lr: 0.000016 - momentum: 0.000000
2023-10-14 23:49:26,688 epoch 6 - iter 722/3617 - loss 0.03008292 - time (sec): 32.71 - samples/sec: 2325.10 - lr: 0.000016 - momentum: 0.000000
2023-10-14 23:49:43,019 epoch 6 - iter 1083/3617 - loss 0.03008366 - time (sec): 49.04 - samples/sec: 2324.89 - lr: 0.000016 - momentum: 0.000000
2023-10-14 23:49:59,394 epoch 6 - iter 1444/3617 - loss 0.02765207 - time (sec): 65.42 - samples/sec: 2337.01 - lr: 0.000015 - momentum: 0.000000
2023-10-14 23:50:15,811 epoch 6 - iter 1805/3617 - loss 0.02577909 - time (sec): 81.83 - samples/sec: 2323.08 - lr: 0.000015 - momentum: 0.000000
2023-10-14 23:50:32,251 epoch 6 - iter 2166/3617 - loss 0.02580397 - time (sec): 98.27 - samples/sec: 2317.84 - lr: 0.000015 - momentum: 0.000000
2023-10-14 23:50:49,652 epoch 6 - iter 2527/3617 - loss 0.02703633 - time (sec): 115.67 - samples/sec: 2292.76 - lr: 0.000014 - momentum: 0.000000
2023-10-14 23:51:05,959 epoch 6 - iter 2888/3617 - loss 0.02642517 - time (sec): 131.98 - samples/sec: 2295.62 - lr: 0.000014 - momentum: 0.000000
2023-10-14 23:51:22,251 epoch 6 - iter 3249/3617 - loss 0.02767332 - time (sec): 148.27 - samples/sec: 2302.85 - lr: 0.000014 - momentum: 0.000000
2023-10-14 23:51:38,863 epoch 6 - iter 3610/3617 - loss 0.02746834 - time (sec): 164.89 - samples/sec: 2299.21 - lr: 0.000013 - momentum: 0.000000
2023-10-14 23:51:39,164 ----------------------------------------------------------------------------------------------------
2023-10-14 23:51:39,165 EPOCH 6 done: loss 0.0275 - lr: 0.000013
2023-10-14 23:51:44,769 DEV : loss 0.31786689162254333 - f1-score (micro avg) 0.6355
2023-10-14 23:51:44,815 ----------------------------------------------------------------------------------------------------
2023-10-14 23:52:01,511 epoch 7 - iter 361/3617 - loss 0.01641665 - time (sec): 16.69 - samples/sec: 2222.47 - lr: 0.000013 - momentum: 0.000000
2023-10-14 23:52:17,844 epoch 7 - iter 722/3617 - loss 0.01652455 - time (sec): 33.03 - samples/sec: 2229.06 - lr: 0.000013 - momentum: 0.000000
2023-10-14 23:52:34,199 epoch 7 - iter 1083/3617 - loss 0.01691783 - time (sec): 49.38 - samples/sec: 2267.82 - lr: 0.000012 - momentum: 0.000000
2023-10-14 23:52:50,534 epoch 7 - iter 1444/3617 - loss 0.01601712 - time (sec): 65.72 - samples/sec: 2269.87 - lr: 0.000012 - momentum: 0.000000
2023-10-14 23:53:06,889 epoch 7 - iter 1805/3617 - loss 0.01694897 - time (sec): 82.07 - samples/sec: 2282.71 - lr: 0.000012 - momentum: 0.000000
2023-10-14 23:53:22,796 epoch 7 - iter 2166/3617 - loss 0.01691405 - time (sec): 97.98 - samples/sec: 2303.14 - lr: 0.000011 - momentum: 0.000000
2023-10-14 23:53:38,736 epoch 7 - iter 2527/3617 - loss 0.01667359 - time (sec): 113.92 - samples/sec: 2325.32 - lr: 0.000011 - momentum: 0.000000
2023-10-14 23:53:54,461 epoch 7 - iter 2888/3617 - loss 0.01633718 - time (sec): 129.64 - samples/sec: 2328.61 - lr: 0.000011 - momentum: 0.000000
2023-10-14 23:54:10,751 epoch 7 - iter 3249/3617 - loss 0.01689256 - time (sec): 145.93 - samples/sec: 2345.45 - lr: 0.000010 - momentum: 0.000000
2023-10-14 23:54:29,301 epoch 7 - iter 3610/3617 - loss 0.01707300 - time (sec): 164.48 - samples/sec: 2306.07 - lr: 0.000010 - momentum: 0.000000
2023-10-14 23:54:29,666 ----------------------------------------------------------------------------------------------------
2023-10-14 23:54:29,666 EPOCH 7 done: loss 0.0171 - lr: 0.000010
2023-10-14 23:54:36,234 DEV : loss 0.34024110436439514 - f1-score (micro avg) 0.6445
2023-10-14 23:54:36,268 ----------------------------------------------------------------------------------------------------
2023-10-14 23:54:52,959 epoch 8 - iter 361/3617 - loss 0.00929665 - time (sec): 16.69 - samples/sec: 2233.85 - lr: 0.000010 - momentum: 0.000000
2023-10-14 23:55:09,344 epoch 8 - iter 722/3617 - loss 0.01095348 - time (sec): 33.07 - samples/sec: 2289.01 - lr: 0.000009 - momentum: 0.000000
2023-10-14 23:55:25,644 epoch 8 - iter 1083/3617 - loss 0.01097502 - time (sec): 49.37 - samples/sec: 2284.68 - lr: 0.000009 - momentum: 0.000000
2023-10-14 23:55:41,915 epoch 8 - iter 1444/3617 - loss 0.01199052 - time (sec): 65.65 - samples/sec: 2316.06 - lr: 0.000009 - momentum: 0.000000
2023-10-14 23:55:58,052 epoch 8 - iter 1805/3617 - loss 0.01155861 - time (sec): 81.78 - samples/sec: 2310.35 - lr: 0.000008 - momentum: 0.000000
2023-10-14 23:56:14,251 epoch 8 - iter 2166/3617 - loss 0.01161260 - time (sec): 97.98 - samples/sec: 2313.58 - lr: 0.000008 - momentum: 0.000000
2023-10-14 23:56:30,608 epoch 8 - iter 2527/3617 - loss 0.01230179 - time (sec): 114.34 - samples/sec: 2311.46 - lr: 0.000008 - momentum: 0.000000
2023-10-14 23:56:46,856 epoch 8 - iter 2888/3617 - loss 0.01231660 - time (sec): 130.59 - samples/sec: 2321.23 - lr: 0.000007 - momentum: 0.000000
2023-10-14 23:57:03,177 epoch 8 - iter 3249/3617 - loss 0.01206507 - time (sec): 146.91 - samples/sec: 2320.26 - lr: 0.000007 - momentum: 0.000000
2023-10-14 23:57:19,556 epoch 8 - iter 3610/3617 - loss 0.01178827 - time (sec): 163.29 - samples/sec: 2321.42 - lr: 0.000007 - momentum: 0.000000
2023-10-14 23:57:19,879 ----------------------------------------------------------------------------------------------------
2023-10-14 23:57:19,879 EPOCH 8 done: loss 0.0118 - lr: 0.000007
2023-10-14 23:57:26,350 DEV : loss 0.36208638548851013 - f1-score (micro avg) 0.6341
2023-10-14 23:57:26,383 ----------------------------------------------------------------------------------------------------
2023-10-14 23:57:42,883 epoch 9 - iter 361/3617 - loss 0.01142158 - time (sec): 16.50 - samples/sec: 2295.97 - lr: 0.000006 - momentum: 0.000000
2023-10-14 23:57:59,357 epoch 9 - iter 722/3617 - loss 0.00738770 - time (sec): 32.97 - samples/sec: 2315.11 - lr: 0.000006 - momentum: 0.000000
2023-10-14 23:58:15,686 epoch 9 - iter 1083/3617 - loss 0.00754335 - time (sec): 49.30 - samples/sec: 2318.80 - lr: 0.000006 - momentum: 0.000000
2023-10-14 23:58:31,888 epoch 9 - iter 1444/3617 - loss 0.00919711 - time (sec): 65.50 - samples/sec: 2311.16 - lr: 0.000005 - momentum: 0.000000
2023-10-14 23:58:48,291 epoch 9 - iter 1805/3617 - loss 0.00809847 - time (sec): 81.91 - samples/sec: 2319.40 - lr: 0.000005 - momentum: 0.000000
2023-10-14 23:59:04,676 epoch 9 - iter 2166/3617 - loss 0.00725574 - time (sec): 98.29 - samples/sec: 2317.26 - lr: 0.000005 - momentum: 0.000000
2023-10-14 23:59:20,803 epoch 9 - iter 2527/3617 - loss 0.00767874 - time (sec): 114.42 - samples/sec: 2317.79 - lr: 0.000004 - momentum: 0.000000
2023-10-14 23:59:37,053 epoch 9 - iter 2888/3617 - loss 0.00829904 - time (sec): 130.67 - samples/sec: 2325.30 - lr: 0.000004 - momentum: 0.000000
2023-10-14 23:59:53,365 epoch 9 - iter 3249/3617 - loss 0.00828150 - time (sec): 146.98 - samples/sec: 2324.96 - lr: 0.000004 - momentum: 0.000000
2023-10-15 00:00:09,734 epoch 9 - iter 3610/3617 - loss 0.00879170 - time (sec): 163.35 - samples/sec: 2322.22 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:00:10,042 ----------------------------------------------------------------------------------------------------
2023-10-15 00:00:10,042 EPOCH 9 done: loss 0.0088 - lr: 0.000003
2023-10-15 00:00:17,518 DEV : loss 0.36112189292907715 - f1-score (micro avg) 0.6484
2023-10-15 00:00:17,562 ----------------------------------------------------------------------------------------------------
2023-10-15 00:00:34,052 epoch 10 - iter 361/3617 - loss 0.00393861 - time (sec): 16.49 - samples/sec: 2343.48 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:00:50,531 epoch 10 - iter 722/3617 - loss 0.00545373 - time (sec): 32.97 - samples/sec: 2330.96 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:01:06,889 epoch 10 - iter 1083/3617 - loss 0.00589789 - time (sec): 49.32 - samples/sec: 2303.79 - lr: 0.000002 - momentum: 0.000000
2023-10-15 00:01:23,187 epoch 10 - iter 1444/3617 - loss 0.00530915 - time (sec): 65.62 - samples/sec: 2326.78 - lr: 0.000002 - momentum: 0.000000
2023-10-15 00:01:39,466 epoch 10 - iter 1805/3617 - loss 0.00552918 - time (sec): 81.90 - samples/sec: 2319.90 - lr: 0.000002 - momentum: 0.000000
2023-10-15 00:01:55,596 epoch 10 - iter 2166/3617 - loss 0.00498351 - time (sec): 98.03 - samples/sec: 2313.78 - lr: 0.000001 - momentum: 0.000000
2023-10-15 00:02:13,164 epoch 10 - iter 2527/3617 - loss 0.00454256 - time (sec): 115.60 - samples/sec: 2302.96 - lr: 0.000001 - momentum: 0.000000
2023-10-15 00:02:29,564 epoch 10 - iter 2888/3617 - loss 0.00484062 - time (sec): 132.00 - samples/sec: 2304.49 - lr: 0.000001 - momentum: 0.000000
2023-10-15 00:02:45,783 epoch 10 - iter 3249/3617 - loss 0.00485285 - time (sec): 148.22 - samples/sec: 2305.05 - lr: 0.000000 - momentum: 0.000000
2023-10-15 00:03:02,013 epoch 10 - iter 3610/3617 - loss 0.00507728 - time (sec): 164.45 - samples/sec: 2304.88 - lr: 0.000000 - momentum: 0.000000
2023-10-15 00:03:02,337 ----------------------------------------------------------------------------------------------------
2023-10-15 00:03:02,338 EPOCH 10 done: loss 0.0051 - lr: 0.000000
2023-10-15 00:03:09,963 DEV : loss 0.4055171608924866 - f1-score (micro avg) 0.651
2023-10-15 00:03:10,499 ----------------------------------------------------------------------------------------------------
2023-10-15 00:03:10,500 Loading model from best epoch ...
2023-10-15 00:03:12,088 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-15 00:03:19,342 Results:
- F-score (micro) 0.6471
- F-score (macro) 0.4397
- Accuracy 0.4897

By class:
              precision    recall  f1-score   support

         loc     0.6222    0.8054    0.7021       591
        pers     0.5701    0.6723    0.6170       357
         org     0.0000    0.0000    0.0000        79

   micro avg     0.6037    0.6972    0.6471      1027
   macro avg     0.3974    0.4926    0.4397      1027
weighted avg     0.5562    0.6972    0.6185      1027

2023-10-15 00:03:19,342 ----------------------------------------------------------------------------------------------------
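The aggregate scores in the final table can be reproduced from the per-class rows, and the best-epoch selection can be confirmed from the per-epoch dev scores in the log. A minimal sketch, with all numbers copied from the log above:

```python
# Per-class scores from the final test table: (precision, recall, f1, support)
by_class = {
    "loc":  (0.6222, 0.8054, 0.7021, 591),
    "pers": (0.5701, 0.6723, 0.6170, 357),
    "org":  (0.0000, 0.0000, 0.0000, 79),
}

# Macro F1: unweighted mean of per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

# Micro F1: harmonic mean of the reported micro precision and recall.
p, r = 0.6037, 0.6972
micro_f1 = 2 * p * r / (p + r)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(s for *_, s in by_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in by_class.values()) / total_support

# Dev micro-F1 per epoch; the trainer keeps the checkpoint of the best one,
# which is epoch 2 here (dev F1 then declines while train loss keeps falling).
dev_f1 = [0.5899, 0.6624, 0.6235, 0.6366, 0.6196, 0.6355, 0.6445, 0.6341, 0.6484, 0.651]
best_epoch = max(range(len(dev_f1)), key=dev_f1.__getitem__) + 1

print(round(macro_f1, 4), round(micro_f1, 4), round(weighted_f1, 4), best_epoch)
# -> 0.4397 0.6471 0.6185 2, matching the reported table and "saving best model" lines
```

Note that `org` (support 79) is never predicted correctly, which is why the macro average (0.4397) sits far below the micro average (0.6471).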