2023-10-14 23:34:14,669 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,669 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
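The corpus is the French "letemps" subset of HIPE-2022, cached under ~/.flair/datasets. A hedged sketch of loading it with Flair's built-in NER_HIPE_2022 loader (the argument names are assumptions based on the dataset path above):

from flair.datasets import NER_HIPE_2022

# Downloads and caches the letemps/fr split on first use.
corpus = NER_HIPE_2022(dataset_name="letemps", language="fr")
print(corpus)  # expected: 14465 train + 1392 dev + 2432 test sentences

# The tag dictionary can alternatively be derived from the corpus itself:
label_dictionary = corpus.make_label_dictionary(label_type="ner")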
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 Train: 14465 sentences
2023-10-14 23:34:14,670 (train_with_dev=False, train_with_test=False)
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 Training Params:
2023-10-14 23:34:14,670 - learning_rate: "3e-05"
2023-10-14 23:34:14,670 - mini_batch_size: "4"
2023-10-14 23:34:14,670 - max_epochs: "10"
2023-10-14 23:34:14,670 - shuffle: "True"
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 Plugins:
2023-10-14 23:34:14,670 - LinearScheduler | warmup_fraction: '0.1'
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 Final evaluation on model from best epoch (best-model.pt)
2023-10-14 23:34:14,670 - metric: "('micro avg', 'f1-score')"
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 Computation:
2023-10-14 23:34:14,670 - compute on device: cuda:0
2023-10-14 23:34:14,670 - embedding storage: none
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
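The training parameters, linear warmup schedule, evaluation metric and base path listed above are what Flair's ModelTrainer prints before training starts. A hedged sketch of the corresponding fine-tuning call, reusing a corpus and tagger built as in the earlier sketches (fine_tune is assumed to attach the linear scheduler with warmup_fraction 0.1 by default, matching the plugin line above):

from flair.data import Corpus
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

def fine_tune_tagger(tagger: SequenceTagger, corpus: Corpus) -> None:
    # Sketch matching the printed hyper-parameters, not the original script.
    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4",
        learning_rate=3e-5,
        mini_batch_size=4,
        max_epochs=10,
        shuffle=True,
        main_evaluation_metric=("micro avg", "f1-score"),
    )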
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:14,670 ----------------------------------------------------------------------------------------------------
2023-10-14 23:34:31,234 epoch 1 - iter 361/3617 - loss 1.36163379 - time (sec): 16.56 - samples/sec: 2286.03 - lr: 0.000003 - momentum: 0.000000
2023-10-14 23:34:47,495 epoch 1 - iter 722/3617 - loss 0.78535193 - time (sec): 32.82 - samples/sec: 2288.96 - lr: 0.000006 - momentum: 0.000000
2023-10-14 23:35:06,592 epoch 1 - iter 1083/3617 - loss 0.57914316 - time (sec): 51.92 - samples/sec: 2183.91 - lr: 0.000009 - momentum: 0.000000
2023-10-14 23:35:23,708 epoch 1 - iter 1444/3617 - loss 0.47133260 - time (sec): 69.04 - samples/sec: 2183.38 - lr: 0.000012 - momentum: 0.000000
2023-10-14 23:35:40,204 epoch 1 - iter 1805/3617 - loss 0.40334999 - time (sec): 85.53 - samples/sec: 2212.25 - lr: 0.000015 - momentum: 0.000000
2023-10-14 23:35:56,791 epoch 1 - iter 2166/3617 - loss 0.35690213 - time (sec): 102.12 - samples/sec: 2222.98 - lr: 0.000018 - momentum: 0.000000
2023-10-14 23:36:13,063 epoch 1 - iter 2527/3617 - loss 0.32256725 - time (sec): 118.39 - samples/sec: 2230.09 - lr: 0.000021 - momentum: 0.000000
2023-10-14 23:36:31,390 epoch 1 - iter 2888/3617 - loss 0.29552514 - time (sec): 136.72 - samples/sec: 2208.50 - lr: 0.000024 - momentum: 0.000000
2023-10-14 23:36:48,635 epoch 1 - iter 3249/3617 - loss 0.27457449 - time (sec): 153.96 - samples/sec: 2214.58 - lr: 0.000027 - momentum: 0.000000
2023-10-14 23:37:07,987 epoch 1 - iter 3610/3617 - loss 0.25799655 - time (sec): 173.32 - samples/sec: 2188.26 - lr: 0.000030 - momentum: 0.000000
2023-10-14 23:37:08,344 ----------------------------------------------------------------------------------------------------
2023-10-14 23:37:08,344 EPOCH 1 done: loss 0.2577 - lr: 0.000030
2023-10-14 23:37:13,548 DEV : loss 0.1281796246767044 - f1-score (micro avg) 0.5899
2023-10-14 23:37:13,589 saving best model
2023-10-14 23:37:14,042 ----------------------------------------------------------------------------------------------------
2023-10-14 23:37:31,572 epoch 2 - iter 361/3617 - loss 0.10519861 - time (sec): 17.53 - samples/sec: 2217.47 - lr: 0.000030 - momentum: 0.000000
2023-10-14 23:37:50,569 epoch 2 - iter 722/3617 - loss 0.10049883 - time (sec): 36.53 - samples/sec: 2096.09 - lr: 0.000029 - momentum: 0.000000
2023-10-14 23:38:07,284 epoch 2 - iter 1083/3617 - loss 0.10110167 - time (sec): 53.24 - samples/sec: 2145.24 - lr: 0.000029 - momentum: 0.000000
2023-10-14 23:38:23,594 epoch 2 - iter 1444/3617 - loss 0.09863041 - time (sec): 69.55 - samples/sec: 2200.65 - lr: 0.000029 - momentum: 0.000000
2023-10-14 23:38:41,303 epoch 2 - iter 1805/3617 - loss 0.10015583 - time (sec): 87.26 - samples/sec: 2188.65 - lr: 0.000028 - momentum: 0.000000
2023-10-14 23:38:59,829 epoch 2 - iter 2166/3617 - loss 0.09892596 - time (sec): 105.79 - samples/sec: 2178.56 - lr: 0.000028 - momentum: 0.000000
2023-10-14 23:39:16,761 epoch 2 - iter 2527/3617 - loss 0.10000026 - time (sec): 122.72 - samples/sec: 2180.19 - lr: 0.000028 - momentum: 0.000000
2023-10-14 23:39:34,633 epoch 2 - iter 2888/3617 - loss 0.09831535 - time (sec): 140.59 - samples/sec: 2164.63 - lr: 0.000027 - momentum: 0.000000
2023-10-14 23:39:51,700 epoch 2 - iter 3249/3617 - loss 0.09854326 - time (sec): 157.66 - samples/sec: 2170.25 - lr: 0.000027 - momentum: 0.000000
2023-10-14 23:40:11,440 epoch 2 - iter 3610/3617 - loss 0.09811914 - time (sec): 177.40 - samples/sec: 2138.64 - lr: 0.000027 - momentum: 0.000000
2023-10-14 23:40:11,809 ----------------------------------------------------------------------------------------------------
2023-10-14 23:40:11,809 EPOCH 2 done: loss 0.0981 - lr: 0.000027
2023-10-14 23:40:18,793 DEV : loss 0.1390405148267746 - f1-score (micro avg) 0.6624
2023-10-14 23:40:18,825 saving best model
2023-10-14 23:40:19,506 ----------------------------------------------------------------------------------------------------
2023-10-14 23:40:36,393 epoch 3 - iter 361/3617 - loss 0.06299067 - time (sec): 16.88 - samples/sec: 2243.14 - lr: 0.000026 - momentum: 0.000000
2023-10-14 23:40:52,924 epoch 3 - iter 722/3617 - loss 0.06908990 - time (sec): 33.42 - samples/sec: 2302.91 - lr: 0.000026 - momentum: 0.000000
2023-10-14 23:41:09,158 epoch 3 - iter 1083/3617 - loss 0.06892332 - time (sec): 49.65 - samples/sec: 2289.47 - lr: 0.000026 - momentum: 0.000000
2023-10-14 23:41:25,608 epoch 3 - iter 1444/3617 - loss 0.07129677 - time (sec): 66.10 - samples/sec: 2289.03 - lr: 0.000025 - momentum: 0.000000
2023-10-14 23:41:41,972 epoch 3 - iter 1805/3617 - loss 0.07081905 - time (sec): 82.46 - samples/sec: 2275.35 - lr: 0.000025 - momentum: 0.000000
2023-10-14 23:41:58,328 epoch 3 - iter 2166/3617 - loss 0.07078356 - time (sec): 98.82 - samples/sec: 2286.31 - lr: 0.000025 - momentum: 0.000000
2023-10-14 23:42:14,621 epoch 3 - iter 2527/3617 - loss 0.07071235 - time (sec): 115.11 - samples/sec: 2294.00 - lr: 0.000024 - momentum: 0.000000
2023-10-14 23:42:30,794 epoch 3 - iter 2888/3617 - loss 0.07182890 - time (sec): 131.28 - samples/sec: 2301.27 - lr: 0.000024 - momentum: 0.000000
2023-10-14 23:42:47,165 epoch 3 - iter 3249/3617 - loss 0.07170442 - time (sec): 147.66 - samples/sec: 2299.57 - lr: 0.000024 - momentum: 0.000000
2023-10-14 23:43:03,880 epoch 3 - iter 3610/3617 - loss 0.07109717 - time (sec): 164.37 - samples/sec: 2308.22 - lr: 0.000023 - momentum: 0.000000
2023-10-14 23:43:04,189 ----------------------------------------------------------------------------------------------------
2023-10-14 23:43:04,189 EPOCH 3 done: loss 0.0713 - lr: 0.000023
2023-10-14 23:43:10,654 DEV : loss 0.19752231240272522 - f1-score (micro avg) 0.6235
2023-10-14 23:43:10,698 ----------------------------------------------------------------------------------------------------
2023-10-14 23:43:27,930 epoch 4 - iter 361/3617 - loss 0.04436168 - time (sec): 17.23 - samples/sec: 2202.46 - lr: 0.000023 - momentum: 0.000000
2023-10-14 23:43:44,435 epoch 4 - iter 722/3617 - loss 0.04585262 - time (sec): 33.73 - samples/sec: 2218.87 - lr: 0.000023 - momentum: 0.000000
2023-10-14 23:44:01,096 epoch 4 - iter 1083/3617 - loss 0.04921851 - time (sec): 50.40 - samples/sec: 2279.70 - lr: 0.000022 - momentum: 0.000000
2023-10-14 23:44:17,549 epoch 4 - iter 1444/3617 - loss 0.04934984 - time (sec): 66.85 - samples/sec: 2280.98 - lr: 0.000022 - momentum: 0.000000
2023-10-14 23:44:34,110 epoch 4 - iter 1805/3617 - loss 0.05175654 - time (sec): 83.41 - samples/sec: 2279.99 - lr: 0.000022 - momentum: 0.000000
2023-10-14 23:44:50,541 epoch 4 - iter 2166/3617 - loss 0.05020951 - time (sec): 99.84 - samples/sec: 2282.77 - lr: 0.000021 - momentum: 0.000000
2023-10-14 23:45:07,182 epoch 4 - iter 2527/3617 - loss 0.05084194 - time (sec): 116.48 - samples/sec: 2283.21 - lr: 0.000021 - momentum: 0.000000
2023-10-14 23:45:23,435 epoch 4 - iter 2888/3617 - loss 0.05193754 - time (sec): 132.74 - samples/sec: 2285.41 - lr: 0.000021 - momentum: 0.000000
2023-10-14 23:45:39,602 epoch 4 - iter 3249/3617 - loss 0.05165414 - time (sec): 148.90 - samples/sec: 2291.14 - lr: 0.000020 - momentum: 0.000000
2023-10-14 23:45:55,738 epoch 4 - iter 3610/3617 - loss 0.05169628 - time (sec): 165.04 - samples/sec: 2298.21 - lr: 0.000020 - momentum: 0.000000
2023-10-14 23:45:56,041 ----------------------------------------------------------------------------------------------------
2023-10-14 23:45:56,041 EPOCH 4 done: loss 0.0517 - lr: 0.000020
2023-10-14 23:46:02,662 DEV : loss 0.2679402828216553 - f1-score (micro avg) 0.6366
2023-10-14 23:46:02,698 ----------------------------------------------------------------------------------------------------
2023-10-14 23:46:19,607 epoch 5 - iter 361/3617 - loss 0.04298391 - time (sec): 16.91 - samples/sec: 2203.83 - lr: 0.000020 - momentum: 0.000000
2023-10-14 23:46:36,400 epoch 5 - iter 722/3617 - loss 0.03826667 - time (sec): 33.70 - samples/sec: 2257.95 - lr: 0.000019 - momentum: 0.000000
2023-10-14 23:46:52,804 epoch 5 - iter 1083/3617 - loss 0.03941655 - time (sec): 50.10 - samples/sec: 2289.34 - lr: 0.000019 - momentum: 0.000000
2023-10-14 23:47:09,226 epoch 5 - iter 1444/3617 - loss 0.03859460 - time (sec): 66.53 - samples/sec: 2288.55 - lr: 0.000019 - momentum: 0.000000
2023-10-14 23:47:25,683 epoch 5 - iter 1805/3617 - loss 0.03843288 - time (sec): 82.98 - samples/sec: 2296.59 - lr: 0.000018 - momentum: 0.000000
2023-10-14 23:47:41,999 epoch 5 - iter 2166/3617 - loss 0.03940779 - time (sec): 99.30 - samples/sec: 2306.54 - lr: 0.000018 - momentum: 0.000000
2023-10-14 23:47:58,428 epoch 5 - iter 2527/3617 - loss 0.03985879 - time (sec): 115.73 - samples/sec: 2320.92 - lr: 0.000018 - momentum: 0.000000
2023-10-14 23:48:14,698 epoch 5 - iter 2888/3617 - loss 0.03859164 - time (sec): 132.00 - samples/sec: 2318.43 - lr: 0.000017 - momentum: 0.000000
2023-10-14 23:48:31,109 epoch 5 - iter 3249/3617 - loss 0.03871626 - time (sec): 148.41 - samples/sec: 2309.97 - lr: 0.000017 - momentum: 0.000000
2023-10-14 23:48:47,292 epoch 5 - iter 3610/3617 - loss 0.03883470 - time (sec): 164.59 - samples/sec: 2304.32 - lr: 0.000017 - momentum: 0.000000
2023-10-14 23:48:47,598 ----------------------------------------------------------------------------------------------------
2023-10-14 23:48:47,598 EPOCH 5 done: loss 0.0388 - lr: 0.000017
2023-10-14 23:48:53,945 DEV : loss 0.2992906868457794 - f1-score (micro avg) 0.6196
2023-10-14 23:48:53,975 ----------------------------------------------------------------------------------------------------
2023-10-14 23:49:10,350 epoch 6 - iter 361/3617 - loss 0.02858051 - time (sec): 16.37 - samples/sec: 2293.30 - lr: 0.000016 - momentum: 0.000000
2023-10-14 23:49:26,688 epoch 6 - iter 722/3617 - loss 0.03008292 - time (sec): 32.71 - samples/sec: 2325.10 - lr: 0.000016 - momentum: 0.000000
2023-10-14 23:49:43,019 epoch 6 - iter 1083/3617 - loss 0.03008366 - time (sec): 49.04 - samples/sec: 2324.89 - lr: 0.000016 - momentum: 0.000000
2023-10-14 23:49:59,394 epoch 6 - iter 1444/3617 - loss 0.02765207 - time (sec): 65.42 - samples/sec: 2337.01 - lr: 0.000015 - momentum: 0.000000
2023-10-14 23:50:15,811 epoch 6 - iter 1805/3617 - loss 0.02577909 - time (sec): 81.83 - samples/sec: 2323.08 - lr: 0.000015 - momentum: 0.000000
2023-10-14 23:50:32,251 epoch 6 - iter 2166/3617 - loss 0.02580397 - time (sec): 98.27 - samples/sec: 2317.84 - lr: 0.000015 - momentum: 0.000000
2023-10-14 23:50:49,652 epoch 6 - iter 2527/3617 - loss 0.02703633 - time (sec): 115.67 - samples/sec: 2292.76 - lr: 0.000014 - momentum: 0.000000
2023-10-14 23:51:05,959 epoch 6 - iter 2888/3617 - loss 0.02642517 - time (sec): 131.98 - samples/sec: 2295.62 - lr: 0.000014 - momentum: 0.000000
2023-10-14 23:51:22,251 epoch 6 - iter 3249/3617 - loss 0.02767332 - time (sec): 148.27 - samples/sec: 2302.85 - lr: 0.000014 - momentum: 0.000000
2023-10-14 23:51:38,863 epoch 6 - iter 3610/3617 - loss 0.02746834 - time (sec): 164.89 - samples/sec: 2299.21 - lr: 0.000013 - momentum: 0.000000
2023-10-14 23:51:39,164 ----------------------------------------------------------------------------------------------------
2023-10-14 23:51:39,165 EPOCH 6 done: loss 0.0275 - lr: 0.000013
2023-10-14 23:51:44,769 DEV : loss 0.31786689162254333 - f1-score (micro avg) 0.6355
2023-10-14 23:51:44,815 ----------------------------------------------------------------------------------------------------
2023-10-14 23:52:01,511 epoch 7 - iter 361/3617 - loss 0.01641665 - time (sec): 16.69 - samples/sec: 2222.47 - lr: 0.000013 - momentum: 0.000000
2023-10-14 23:52:17,844 epoch 7 - iter 722/3617 - loss 0.01652455 - time (sec): 33.03 - samples/sec: 2229.06 - lr: 0.000013 - momentum: 0.000000
2023-10-14 23:52:34,199 epoch 7 - iter 1083/3617 - loss 0.01691783 - time (sec): 49.38 - samples/sec: 2267.82 - lr: 0.000012 - momentum: 0.000000
2023-10-14 23:52:50,534 epoch 7 - iter 1444/3617 - loss 0.01601712 - time (sec): 65.72 - samples/sec: 2269.87 - lr: 0.000012 - momentum: 0.000000
2023-10-14 23:53:06,889 epoch 7 - iter 1805/3617 - loss 0.01694897 - time (sec): 82.07 - samples/sec: 2282.71 - lr: 0.000012 - momentum: 0.000000
2023-10-14 23:53:22,796 epoch 7 - iter 2166/3617 - loss 0.01691405 - time (sec): 97.98 - samples/sec: 2303.14 - lr: 0.000011 - momentum: 0.000000
2023-10-14 23:53:38,736 epoch 7 - iter 2527/3617 - loss 0.01667359 - time (sec): 113.92 - samples/sec: 2325.32 - lr: 0.000011 - momentum: 0.000000
2023-10-14 23:53:54,461 epoch 7 - iter 2888/3617 - loss 0.01633718 - time (sec): 129.64 - samples/sec: 2328.61 - lr: 0.000011 - momentum: 0.000000
2023-10-14 23:54:10,751 epoch 7 - iter 3249/3617 - loss 0.01689256 - time (sec): 145.93 - samples/sec: 2345.45 - lr: 0.000010 - momentum: 0.000000
2023-10-14 23:54:29,301 epoch 7 - iter 3610/3617 - loss 0.01707300 - time (sec): 164.48 - samples/sec: 2306.07 - lr: 0.000010 - momentum: 0.000000
2023-10-14 23:54:29,666 ----------------------------------------------------------------------------------------------------
2023-10-14 23:54:29,666 EPOCH 7 done: loss 0.0171 - lr: 0.000010
2023-10-14 23:54:36,234 DEV : loss 0.34024110436439514 - f1-score (micro avg) 0.6445
2023-10-14 23:54:36,268 ----------------------------------------------------------------------------------------------------
2023-10-14 23:54:52,959 epoch 8 - iter 361/3617 - loss 0.00929665 - time (sec): 16.69 - samples/sec: 2233.85 - lr: 0.000010 - momentum: 0.000000
2023-10-14 23:55:09,344 epoch 8 - iter 722/3617 - loss 0.01095348 - time (sec): 33.07 - samples/sec: 2289.01 - lr: 0.000009 - momentum: 0.000000
2023-10-14 23:55:25,644 epoch 8 - iter 1083/3617 - loss 0.01097502 - time (sec): 49.37 - samples/sec: 2284.68 - lr: 0.000009 - momentum: 0.000000
2023-10-14 23:55:41,915 epoch 8 - iter 1444/3617 - loss 0.01199052 - time (sec): 65.65 - samples/sec: 2316.06 - lr: 0.000009 - momentum: 0.000000
2023-10-14 23:55:58,052 epoch 8 - iter 1805/3617 - loss 0.01155861 - time (sec): 81.78 - samples/sec: 2310.35 - lr: 0.000008 - momentum: 0.000000
2023-10-14 23:56:14,251 epoch 8 - iter 2166/3617 - loss 0.01161260 - time (sec): 97.98 - samples/sec: 2313.58 - lr: 0.000008 - momentum: 0.000000
2023-10-14 23:56:30,608 epoch 8 - iter 2527/3617 - loss 0.01230179 - time (sec): 114.34 - samples/sec: 2311.46 - lr: 0.000008 - momentum: 0.000000
2023-10-14 23:56:46,856 epoch 8 - iter 2888/3617 - loss 0.01231660 - time (sec): 130.59 - samples/sec: 2321.23 - lr: 0.000007 - momentum: 0.000000
2023-10-14 23:57:03,177 epoch 8 - iter 3249/3617 - loss 0.01206507 - time (sec): 146.91 - samples/sec: 2320.26 - lr: 0.000007 - momentum: 0.000000
2023-10-14 23:57:19,556 epoch 8 - iter 3610/3617 - loss 0.01178827 - time (sec): 163.29 - samples/sec: 2321.42 - lr: 0.000007 - momentum: 0.000000
2023-10-14 23:57:19,879 ----------------------------------------------------------------------------------------------------
2023-10-14 23:57:19,879 EPOCH 8 done: loss 0.0118 - lr: 0.000007
2023-10-14 23:57:26,350 DEV : loss 0.36208638548851013 - f1-score (micro avg) 0.6341
2023-10-14 23:57:26,383 ----------------------------------------------------------------------------------------------------
2023-10-14 23:57:42,883 epoch 9 - iter 361/3617 - loss 0.01142158 - time (sec): 16.50 - samples/sec: 2295.97 - lr: 0.000006 - momentum: 0.000000
2023-10-14 23:57:59,357 epoch 9 - iter 722/3617 - loss 0.00738770 - time (sec): 32.97 - samples/sec: 2315.11 - lr: 0.000006 - momentum: 0.000000
2023-10-14 23:58:15,686 epoch 9 - iter 1083/3617 - loss 0.00754335 - time (sec): 49.30 - samples/sec: 2318.80 - lr: 0.000006 - momentum: 0.000000
2023-10-14 23:58:31,888 epoch 9 - iter 1444/3617 - loss 0.00919711 - time (sec): 65.50 - samples/sec: 2311.16 - lr: 0.000005 - momentum: 0.000000
2023-10-14 23:58:48,291 epoch 9 - iter 1805/3617 - loss 0.00809847 - time (sec): 81.91 - samples/sec: 2319.40 - lr: 0.000005 - momentum: 0.000000
2023-10-14 23:59:04,676 epoch 9 - iter 2166/3617 - loss 0.00725574 - time (sec): 98.29 - samples/sec: 2317.26 - lr: 0.000005 - momentum: 0.000000
2023-10-14 23:59:20,803 epoch 9 - iter 2527/3617 - loss 0.00767874 - time (sec): 114.42 - samples/sec: 2317.79 - lr: 0.000004 - momentum: 0.000000
2023-10-14 23:59:37,053 epoch 9 - iter 2888/3617 - loss 0.00829904 - time (sec): 130.67 - samples/sec: 2325.30 - lr: 0.000004 - momentum: 0.000000
2023-10-14 23:59:53,365 epoch 9 - iter 3249/3617 - loss 0.00828150 - time (sec): 146.98 - samples/sec: 2324.96 - lr: 0.000004 - momentum: 0.000000
2023-10-15 00:00:09,734 epoch 9 - iter 3610/3617 - loss 0.00879170 - time (sec): 163.35 - samples/sec: 2322.22 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:00:10,042 ----------------------------------------------------------------------------------------------------
2023-10-15 00:00:10,042 EPOCH 9 done: loss 0.0088 - lr: 0.000003
2023-10-15 00:00:17,518 DEV : loss 0.36112189292907715 - f1-score (micro avg) 0.6484
2023-10-15 00:00:17,562 ----------------------------------------------------------------------------------------------------
2023-10-15 00:00:34,052 epoch 10 - iter 361/3617 - loss 0.00393861 - time (sec): 16.49 - samples/sec: 2343.48 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:00:50,531 epoch 10 - iter 722/3617 - loss 0.00545373 - time (sec): 32.97 - samples/sec: 2330.96 - lr: 0.000003 - momentum: 0.000000
2023-10-15 00:01:06,889 epoch 10 - iter 1083/3617 - loss 0.00589789 - time (sec): 49.32 - samples/sec: 2303.79 - lr: 0.000002 - momentum: 0.000000
2023-10-15 00:01:23,187 epoch 10 - iter 1444/3617 - loss 0.00530915 - time (sec): 65.62 - samples/sec: 2326.78 - lr: 0.000002 - momentum: 0.000000
2023-10-15 00:01:39,466 epoch 10 - iter 1805/3617 - loss 0.00552918 - time (sec): 81.90 - samples/sec: 2319.90 - lr: 0.000002 - momentum: 0.000000
2023-10-15 00:01:55,596 epoch 10 - iter 2166/3617 - loss 0.00498351 - time (sec): 98.03 - samples/sec: 2313.78 - lr: 0.000001 - momentum: 0.000000
2023-10-15 00:02:13,164 epoch 10 - iter 2527/3617 - loss 0.00454256 - time (sec): 115.60 - samples/sec: 2302.96 - lr: 0.000001 - momentum: 0.000000
2023-10-15 00:02:29,564 epoch 10 - iter 2888/3617 - loss 0.00484062 - time (sec): 132.00 - samples/sec: 2304.49 - lr: 0.000001 - momentum: 0.000000
2023-10-15 00:02:45,783 epoch 10 - iter 3249/3617 - loss 0.00485285 - time (sec): 148.22 - samples/sec: 2305.05 - lr: 0.000000 - momentum: 0.000000
2023-10-15 00:03:02,013 epoch 10 - iter 3610/3617 - loss 0.00507728 - time (sec): 164.45 - samples/sec: 2304.88 - lr: 0.000000 - momentum: 0.000000
2023-10-15 00:03:02,337 ----------------------------------------------------------------------------------------------------
2023-10-15 00:03:02,338 EPOCH 10 done: loss 0.0051 - lr: 0.000000
2023-10-15 00:03:09,963 DEV : loss 0.4055171608924866 - f1-score (micro avg) 0.651
2023-10-15 00:03:10,499 ----------------------------------------------------------------------------------------------------
2023-10-15 00:03:10,500 Loading model from best epoch ...
2023-10-15 00:03:12,088 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
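The checkpoint reloaded here is best-model.pt, selected on dev micro F1 (epoch 2 in this run). A small hedged sketch of using it for inference; the sentence is an arbitrary illustration, not taken from the corpus:

from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

sentence = Sentence("Victor Hugo est né à Besançon.")
tagger.predict(sentence)

# Print the predicted loc/pers/org spans with their confidence scores.
for span in sentence.get_spans("ner"):
    print(span.text, span.tag, round(span.score, 3))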
2023-10-15 00:03:19,342
Results:
- F-score (micro) 0.6471
- F-score (macro) 0.4397
- Accuracy 0.4897
By class:
              precision    recall  f1-score   support

         loc     0.6222    0.8054    0.7021       591
        pers     0.5701    0.6723    0.6170       357
         org     0.0000    0.0000    0.0000        79

   micro avg     0.6037    0.6972    0.6471      1027
   macro avg     0.3974    0.4926    0.4397      1027
weighted avg     0.5562    0.6972    0.6185      1027
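As a quick sanity check on the table above, the micro-averaged F-score is the harmonic mean of the micro precision and recall; note that the org class (79 test entities) is never predicted correctly, which is what pulls the macro average down to 0.4397.

# Harmonic mean of the micro-averaged precision and recall from the table above.
precision, recall = 0.6037, 0.6972
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.6471, matching the "F-score (micro)" line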
2023-10-15 00:03:19,342 ----------------------------------------------------------------------------------------------------