2023-10-17 16:47:12,055 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,057 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=21, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:47:12,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,057 MultiCorpus: 3575 train + 1235 dev + 1266 test sentences
 - NER_HIPE_2022 Corpus: 3575 train + 1235 dev + 1266 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/hipe2020/de/with_doc_seperator
2023-10-17 16:47:12,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,057 Train: 3575 sentences
2023-10-17 16:47:12,057 (train_with_dev=False, train_with_test=False)
2023-10-17 16:47:12,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,058 Training Params:
2023-10-17 16:47:12,058 - learning_rate: "3e-05"
2023-10-17 16:47:12,058 - mini_batch_size: "8"
2023-10-17 16:47:12,058 - max_epochs: "10"
2023-10-17 16:47:12,058 - shuffle: "True"
2023-10-17 16:47:12,058 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,058 Plugins:
2023-10-17 16:47:12,058 - TensorboardLogger
2023-10-17 16:47:12,058 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:47:12,058 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,058 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:47:12,058 - metric: "('micro avg', 'f1-score')"
2023-10-17 16:47:12,058 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,059 Computation:
2023-10-17 16:47:12,059 - compute on device: cuda:0
2023-10-17 16:47:12,059 - embedding storage: none
2023-10-17 16:47:12,059 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:12,059 Model training base path: "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
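Note: the setup logged above (ELECTRA-style historic multilingual backbone, last transformer layer only, first-subtoken pooling, no CRF, linear schedule with 10% warm-up) corresponds to a standard Flair fine-tuning run. The snippet below is a minimal sketch of how such a run could be configured, not the exact hmbench script: it assumes a recent Flair release (0.13+), takes the backbone name from the base path, treats the add_document_separator flag as an assumption matching the "with_doc_seperator" corpus variant, and omits the TensorBoard plugin wiring.

from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# HIPE-2022 "hipe2020" German split, as logged above (document-separator variant assumed).
corpus = NER_HIPE_2022(dataset_name="hipe2020", language="de", add_document_separator=True)

label_type = "ner"
label_dict = corpus.make_label_dictionary(label_type=label_type)

# Embeddings matching the logged configuration: last layer, first-subtoken pooling, fine-tuned.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # taken from the base path
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Plain linear tag head (no CRF, no RNN), as in the printed SequenceTagger module.
tagger = SequenceTagger(
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type=label_type,
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)

# Fine-tuning with the logged hyperparameters; the linear LR schedule with warm-up
# shown under "Plugins" is Flair's default behaviour for fine_tune().
trainer.fine_tune(
    "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=3e-5,
    mini_batch_size=8,
    max_epochs=10,
)

With 447 mini-batches per epoch over 10 epochs (about 4,470 optimizer steps), a warm-up fraction of 0.1 covers roughly the first epoch, which matches the lr column below: it climbs to about 3e-05 during epoch 1 and then decays linearly towards 0 by epoch 10.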
"hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3" 2023-10-17 16:47:12,059 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:47:12,059 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:47:12,059 Logging anything other than scalars to TensorBoard is currently not supported. 2023-10-17 16:47:16,186 epoch 1 - iter 44/447 - loss 3.49661378 - time (sec): 4.13 - samples/sec: 1872.19 - lr: 0.000003 - momentum: 0.000000 2023-10-17 16:47:20,660 epoch 1 - iter 88/447 - loss 2.61631042 - time (sec): 8.60 - samples/sec: 1956.66 - lr: 0.000006 - momentum: 0.000000 2023-10-17 16:47:24,936 epoch 1 - iter 132/447 - loss 1.93352666 - time (sec): 12.88 - samples/sec: 1984.67 - lr: 0.000009 - momentum: 0.000000 2023-10-17 16:47:28,939 epoch 1 - iter 176/447 - loss 1.57530896 - time (sec): 16.88 - samples/sec: 2000.58 - lr: 0.000012 - momentum: 0.000000 2023-10-17 16:47:33,220 epoch 1 - iter 220/447 - loss 1.34725282 - time (sec): 21.16 - samples/sec: 1996.09 - lr: 0.000015 - momentum: 0.000000 2023-10-17 16:47:37,715 epoch 1 - iter 264/447 - loss 1.15393413 - time (sec): 25.65 - samples/sec: 2022.57 - lr: 0.000018 - momentum: 0.000000 2023-10-17 16:47:41,873 epoch 1 - iter 308/447 - loss 1.04226944 - time (sec): 29.81 - samples/sec: 2024.30 - lr: 0.000021 - momentum: 0.000000 2023-10-17 16:47:45,852 epoch 1 - iter 352/447 - loss 0.95292814 - time (sec): 33.79 - samples/sec: 2021.95 - lr: 0.000024 - momentum: 0.000000 2023-10-17 16:47:50,370 epoch 1 - iter 396/447 - loss 0.87278568 - time (sec): 38.31 - samples/sec: 2013.81 - lr: 0.000027 - momentum: 0.000000 2023-10-17 16:47:54,446 epoch 1 - iter 440/447 - loss 0.81583916 - time (sec): 42.39 - samples/sec: 2007.91 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:47:55,074 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:47:55,075 EPOCH 1 done: loss 0.8052 - lr: 0.000029 2023-10-17 16:48:01,442 DEV : loss 0.18609091639518738 - f1-score (micro avg) 0.62 2023-10-17 16:48:01,495 saving best model 2023-10-17 16:48:02,032 ---------------------------------------------------------------------------------------------------- 2023-10-17 16:48:06,103 epoch 2 - iter 44/447 - loss 0.17646139 - time (sec): 4.07 - samples/sec: 2094.88 - lr: 0.000030 - momentum: 0.000000 2023-10-17 16:48:10,127 epoch 2 - iter 88/447 - loss 0.17558676 - time (sec): 8.09 - samples/sec: 2086.39 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:48:14,170 epoch 2 - iter 132/447 - loss 0.16683960 - time (sec): 12.14 - samples/sec: 2026.77 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:48:18,156 epoch 2 - iter 176/447 - loss 0.16952362 - time (sec): 16.12 - samples/sec: 1992.29 - lr: 0.000029 - momentum: 0.000000 2023-10-17 16:48:22,496 epoch 2 - iter 220/447 - loss 0.16761727 - time (sec): 20.46 - samples/sec: 2027.06 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:48:26,893 epoch 2 - iter 264/447 - loss 0.17109456 - time (sec): 24.86 - samples/sec: 2038.35 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:48:30,902 epoch 2 - iter 308/447 - loss 0.17214782 - time (sec): 28.87 - samples/sec: 2044.42 - lr: 0.000028 - momentum: 0.000000 2023-10-17 16:48:35,157 epoch 2 - iter 352/447 - loss 0.16542676 - time (sec): 33.12 - samples/sec: 2051.24 - lr: 0.000027 - momentum: 0.000000 2023-10-17 16:48:39,571 epoch 2 - iter 
2023-10-17 16:48:43,542 epoch 2 - iter 440/447 - loss 0.15666011 - time (sec): 41.51 - samples/sec: 2055.00 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:48:44,156 ----------------------------------------------------------------------------------------------------
2023-10-17 16:48:44,156 EPOCH 2 done: loss 0.1564 - lr: 0.000027
2023-10-17 16:48:55,101 DEV : loss 0.11850441992282867 - f1-score (micro avg) 0.7004
2023-10-17 16:48:55,153 saving best model
2023-10-17 16:48:56,532 ----------------------------------------------------------------------------------------------------
2023-10-17 16:49:00,603 epoch 3 - iter 44/447 - loss 0.08474878 - time (sec): 4.07 - samples/sec: 2113.03 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:49:04,693 epoch 3 - iter 88/447 - loss 0.08245773 - time (sec): 8.16 - samples/sec: 2090.39 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:49:09,017 epoch 3 - iter 132/447 - loss 0.08517300 - time (sec): 12.48 - samples/sec: 2086.23 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:49:13,016 epoch 3 - iter 176/447 - loss 0.08575584 - time (sec): 16.48 - samples/sec: 2062.81 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:49:17,520 epoch 3 - iter 220/447 - loss 0.08751252 - time (sec): 20.98 - samples/sec: 2038.05 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:49:22,077 epoch 3 - iter 264/447 - loss 0.08907774 - time (sec): 25.54 - samples/sec: 2025.88 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:49:26,176 epoch 3 - iter 308/447 - loss 0.08626226 - time (sec): 29.64 - samples/sec: 2027.47 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:49:30,200 epoch 3 - iter 352/447 - loss 0.08544823 - time (sec): 33.66 - samples/sec: 2036.17 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:49:34,477 epoch 3 - iter 396/447 - loss 0.08472916 - time (sec): 37.94 - samples/sec: 2041.16 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:49:38,389 epoch 3 - iter 440/447 - loss 0.08337869 - time (sec): 41.85 - samples/sec: 2039.72 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:49:39,011 ----------------------------------------------------------------------------------------------------
2023-10-17 16:49:39,011 EPOCH 3 done: loss 0.0831 - lr: 0.000023
2023-10-17 16:49:50,287 DEV : loss 0.16098077595233917 - f1-score (micro avg) 0.7368
2023-10-17 16:49:50,342 saving best model
2023-10-17 16:49:51,747 ----------------------------------------------------------------------------------------------------
2023-10-17 16:49:56,121 epoch 4 - iter 44/447 - loss 0.06080484 - time (sec): 4.37 - samples/sec: 2053.56 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:50:00,102 epoch 4 - iter 88/447 - loss 0.05047837 - time (sec): 8.35 - samples/sec: 2073.72 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:50:04,214 epoch 4 - iter 132/447 - loss 0.04822910 - time (sec): 12.46 - samples/sec: 2078.43 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:50:08,227 epoch 4 - iter 176/447 - loss 0.05146065 - time (sec): 16.48 - samples/sec: 2067.12 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:50:12,222 epoch 4 - iter 220/447 - loss 0.05457535 - time (sec): 20.47 - samples/sec: 2081.48 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:50:16,289 epoch 4 - iter 264/447 - loss 0.05684864 - time (sec): 24.54 - samples/sec: 2089.74 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:50:20,515 epoch 4 - iter 308/447 - loss 0.05629000 - time (sec): 28.76 - samples/sec: 2067.32 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:50:24,525 epoch 4 - iter 352/447 - loss 0.05515985 - time (sec): 32.77 - samples/sec: 2058.16 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:50:29,074 epoch 4 - iter 396/447 - loss 0.05406735 - time (sec): 37.32 - samples/sec: 2059.09 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:50:33,156 epoch 4 - iter 440/447 - loss 0.05404876 - time (sec): 41.40 - samples/sec: 2055.53 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:50:33,828 ----------------------------------------------------------------------------------------------------
2023-10-17 16:50:33,829 EPOCH 4 done: loss 0.0537 - lr: 0.000020
2023-10-17 16:50:44,774 DEV : loss 0.17118440568447113 - f1-score (micro avg) 0.7738
2023-10-17 16:50:44,823 saving best model
2023-10-17 16:50:46,179 ----------------------------------------------------------------------------------------------------
2023-10-17 16:50:50,043 epoch 5 - iter 44/447 - loss 0.02823819 - time (sec): 3.86 - samples/sec: 2125.42 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:50:54,046 epoch 5 - iter 88/447 - loss 0.02974272 - time (sec): 7.86 - samples/sec: 2178.10 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:50:58,481 epoch 5 - iter 132/447 - loss 0.03274366 - time (sec): 12.30 - samples/sec: 2173.97 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:51:02,421 epoch 5 - iter 176/447 - loss 0.03126456 - time (sec): 16.24 - samples/sec: 2120.29 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:51:06,701 epoch 5 - iter 220/447 - loss 0.02862125 - time (sec): 20.52 - samples/sec: 2123.83 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:51:10,991 epoch 5 - iter 264/447 - loss 0.02862398 - time (sec): 24.81 - samples/sec: 2103.31 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:51:15,112 epoch 5 - iter 308/447 - loss 0.02844421 - time (sec): 28.93 - samples/sec: 2083.75 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:51:19,138 epoch 5 - iter 352/447 - loss 0.02956228 - time (sec): 32.96 - samples/sec: 2071.76 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:51:23,488 epoch 5 - iter 396/447 - loss 0.03284724 - time (sec): 37.31 - samples/sec: 2064.00 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:51:27,485 epoch 5 - iter 440/447 - loss 0.03180610 - time (sec): 41.30 - samples/sec: 2062.28 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:51:28,153 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:28,154 EPOCH 5 done: loss 0.0315 - lr: 0.000017
2023-10-17 16:51:39,205 DEV : loss 0.17024928331375122 - f1-score (micro avg) 0.7777
2023-10-17 16:51:39,267 saving best model
2023-10-17 16:51:40,718 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:45,030 epoch 6 - iter 44/447 - loss 0.02203100 - time (sec): 4.31 - samples/sec: 2041.04 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:51:49,781 epoch 6 - iter 88/447 - loss 0.01920241 - time (sec): 9.06 - samples/sec: 2020.90 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:51:54,204 epoch 6 - iter 132/447 - loss 0.02226912 - time (sec): 13.48 - samples/sec: 1979.81 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:51:58,231 epoch 6 - iter 176/447 - loss 0.02378920 - time (sec): 17.51 - samples/sec: 1959.27 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:52:02,125 epoch 6 - iter 220/447 - loss 0.02333195 - time (sec): 21.40 - samples/sec: 1932.28 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:52:06,311 epoch 6 - iter 264/447 - loss 0.02310408 - time (sec): 25.59 - samples/sec: 1946.29 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:52:10,568 epoch 6 - iter 308/447 - loss 0.02234226 - time (sec): 29.85 - samples/sec: 1981.40 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:52:14,740 epoch 6 - iter 352/447 - loss 0.02289841 - time (sec): 34.02 - samples/sec: 1984.53 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:52:19,543 epoch 6 - iter 396/447 - loss 0.02189250 - time (sec): 38.82 - samples/sec: 1986.59 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:52:23,522 epoch 6 - iter 440/447 - loss 0.02187698 - time (sec): 42.80 - samples/sec: 1997.83 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:52:24,166 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:24,167 EPOCH 6 done: loss 0.0230 - lr: 0.000013
2023-10-17 16:52:34,982 DEV : loss 0.19983802735805511 - f1-score (micro avg) 0.803
2023-10-17 16:52:35,059 saving best model
2023-10-17 16:52:36,497 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:40,921 epoch 7 - iter 44/447 - loss 0.00850187 - time (sec): 4.42 - samples/sec: 2092.53 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:52:45,097 epoch 7 - iter 88/447 - loss 0.01184907 - time (sec): 8.60 - samples/sec: 2003.34 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:52:49,559 epoch 7 - iter 132/447 - loss 0.01008429 - time (sec): 13.06 - samples/sec: 1983.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:52:53,577 epoch 7 - iter 176/447 - loss 0.01116324 - time (sec): 17.08 - samples/sec: 1978.81 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:52:57,658 epoch 7 - iter 220/447 - loss 0.01311415 - time (sec): 21.16 - samples/sec: 1975.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:53:01,902 epoch 7 - iter 264/447 - loss 0.01472328 - time (sec): 25.40 - samples/sec: 1967.29 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:53:06,100 epoch 7 - iter 308/447 - loss 0.01385064 - time (sec): 29.60 - samples/sec: 1977.81 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:53:10,128 epoch 7 - iter 352/447 - loss 0.01331558 - time (sec): 33.63 - samples/sec: 1999.23 - lr: 0.000011 - momentum: 0.000000
2023-10-17 16:53:14,178 epoch 7 - iter 396/447 - loss 0.01397777 - time (sec): 37.68 - samples/sec: 2010.39 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:53:18,463 epoch 7 - iter 440/447 - loss 0.01368698 - time (sec): 41.96 - samples/sec: 2027.17 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:53:19,188 ----------------------------------------------------------------------------------------------------
2023-10-17 16:53:19,188 EPOCH 7 done: loss 0.0135 - lr: 0.000010
2023-10-17 16:53:30,279 DEV : loss 0.20469656586647034 - f1-score (micro avg) 0.7921
2023-10-17 16:53:30,335 ----------------------------------------------------------------------------------------------------
2023-10-17 16:53:34,480 epoch 8 - iter 44/447 - loss 0.00313666 - time (sec): 4.14 - samples/sec: 2080.39 - lr: 0.000010 - momentum: 0.000000
2023-10-17 16:53:38,581 epoch 8 - iter 88/447 - loss 0.00502648 - time (sec): 8.24 - samples/sec: 2049.56 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:53:42,753 epoch 8 - iter 132/447 - loss 0.00740875 - time (sec): 12.42 - samples/sec: 2062.49 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:53:47,062 epoch 8 - iter 176/447 - loss 0.00859745 - time (sec): 16.73 - samples/sec: 2020.65 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:53:51,256 epoch 8 - iter 220/447 - loss 0.00798549 - time (sec): 20.92 - samples/sec: 2006.40 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:53:55,572 epoch 8 - iter 264/447 - loss 0.00781959 - time (sec): 25.23 - samples/sec: 2015.40 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:53:59,928 epoch 8 - iter 308/447 - loss 0.00794785 - time (sec): 29.59 - samples/sec: 2008.36 - lr: 0.000008 - momentum: 0.000000
2023-10-17 16:54:04,317 epoch 8 - iter 352/447 - loss 0.00872736 - time (sec): 33.98 - samples/sec: 1991.41 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:54:08,671 epoch 8 - iter 396/447 - loss 0.00928899 - time (sec): 38.33 - samples/sec: 1991.38 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:54:13,050 epoch 8 - iter 440/447 - loss 0.00882367 - time (sec): 42.71 - samples/sec: 1995.01 - lr: 0.000007 - momentum: 0.000000
2023-10-17 16:54:13,689 ----------------------------------------------------------------------------------------------------
2023-10-17 16:54:13,689 EPOCH 8 done: loss 0.0088 - lr: 0.000007
2023-10-17 16:54:25,266 DEV : loss 0.21882730722427368 - f1-score (micro avg) 0.8018
2023-10-17 16:54:25,341 ----------------------------------------------------------------------------------------------------
2023-10-17 16:54:29,957 epoch 9 - iter 44/447 - loss 0.00911349 - time (sec): 4.61 - samples/sec: 1939.98 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:54:34,629 epoch 9 - iter 88/447 - loss 0.00622748 - time (sec): 9.28 - samples/sec: 2044.19 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:54:38,713 epoch 9 - iter 132/447 - loss 0.00477911 - time (sec): 13.37 - samples/sec: 2033.90 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:54:43,037 epoch 9 - iter 176/447 - loss 0.00545796 - time (sec): 17.69 - samples/sec: 1994.09 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:54:47,439 epoch 9 - iter 220/447 - loss 0.00514891 - time (sec): 22.09 - samples/sec: 2000.95 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:54:51,898 epoch 9 - iter 264/447 - loss 0.00530463 - time (sec): 26.55 - samples/sec: 1997.43 - lr: 0.000005 - momentum: 0.000000
2023-10-17 16:54:56,171 epoch 9 - iter 308/447 - loss 0.00519452 - time (sec): 30.83 - samples/sec: 1989.09 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:55:00,230 epoch 9 - iter 352/447 - loss 0.00567824 - time (sec): 34.89 - samples/sec: 1984.82 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:55:04,392 epoch 9 - iter 396/447 - loss 0.00625861 - time (sec): 39.05 - samples/sec: 1978.66 - lr: 0.000004 - momentum: 0.000000
2023-10-17 16:55:08,577 epoch 9 - iter 440/447 - loss 0.00656770 - time (sec): 43.23 - samples/sec: 1975.09 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:55:09,189 ----------------------------------------------------------------------------------------------------
2023-10-17 16:55:09,190 EPOCH 9 done: loss 0.0065 - lr: 0.000003
2023-10-17 16:55:20,752 DEV : loss 0.22874712944030762 - f1-score (micro avg) 0.8066
2023-10-17 16:55:20,817 saving best model
2023-10-17 16:55:22,230 ----------------------------------------------------------------------------------------------------
2023-10-17 16:55:26,626 epoch 10 - iter 44/447 - loss 0.00173260 - time (sec): 4.39 - samples/sec: 1946.13 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:55:30,957 epoch 10 - iter 88/447 - loss 0.00259927 - time (sec): 8.72 - samples/sec: 1922.78 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:55:34,844 epoch 10 - iter 132/447 - loss 0.00371022 - time (sec): 12.61 - samples/sec: 1951.89 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:55:39,216 epoch 10 - iter 176/447 - loss 0.00343244 - time (sec): 16.98 - samples/sec: 1973.14 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:55:43,550 epoch 10 - iter 220/447 - loss 0.00410654 - time (sec): 21.31 - samples/sec: 1982.22 - lr: 0.000002 - momentum: 0.000000
2023-10-17 16:55:48,037 epoch 10 - iter 264/447 - loss 0.00468622 - time (sec): 25.80 - samples/sec: 1992.41 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:55:51,992 epoch 10 - iter 308/447 - loss 0.00492962 - time (sec): 29.76 - samples/sec: 1994.33 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:55:56,019 epoch 10 - iter 352/447 - loss 0.00498152 - time (sec): 33.78 - samples/sec: 2022.40 - lr: 0.000001 - momentum: 0.000000
2023-10-17 16:55:59,943 epoch 10 - iter 396/447 - loss 0.00514324 - time (sec): 37.71 - samples/sec: 2033.47 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:56:04,083 epoch 10 - iter 440/447 - loss 0.00486802 - time (sec): 41.85 - samples/sec: 2035.70 - lr: 0.000000 - momentum: 0.000000
2023-10-17 16:56:04,730 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:04,731 EPOCH 10 done: loss 0.0048 - lr: 0.000000
2023-10-17 16:56:15,689 DEV : loss 0.23447194695472717 - f1-score (micro avg) 0.805
2023-10-17 16:56:16,295 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:16,298 Loading model from best epoch ...
2023-10-17 16:56:19,028 SequenceTagger predicts: Dictionary with 21 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org, S-prod, B-prod, E-prod, I-prod, S-time, B-time, E-time, I-time
2023-10-17 16:56:25,007 Results:
- F-score (micro) 0.7627
- F-score (macro) 0.6747
- Accuracy 0.6391

By class:
              precision    recall  f1-score   support

         loc     0.8617    0.8574    0.8595       596
        pers     0.7067    0.7958    0.7486       333
         org     0.4667    0.5833    0.5185       132
        prod     0.5965    0.5152    0.5528        66
        time     0.6939    0.6939    0.6939        49

   micro avg     0.7433    0.7832    0.7627      1176
   macro avg     0.6651    0.6891    0.6747      1176
weighted avg     0.7516    0.7832    0.7657      1176

2023-10-17 16:56:25,007 ----------------------------------------------------------------------------------------------------
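Note: the micro-averaged F1 above is consistent with the reported micro precision and recall, since 2 * 0.7433 * 0.7832 / (0.7433 + 0.7832) ≈ 0.7627. For reference, the best-model.pt checkpoint written during training can be loaded for inference with Flair's standard API. The snippet below is a minimal sketch, not part of the hmbench tooling: the checkpoint path is assumed to sit under the training base path shown earlier, and the example sentence is invented.

from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint saved by the "saving best model" steps above (path assumed).
tagger = SequenceTagger.load(
    "hmbench-hipe2020/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

# Hypothetical German sentence; the model tags loc/pers/org/prod/time entities.
sentence = Sentence("Der Gemeinderat von Zürich empfing Herrn Escher am Montag.")
tagger.predict(sentence)

for entity in sentence.get_spans("ner"):
    print(entity)

get_spans("ner") returns decoded entity spans, with the BIOES tags from the 21-tag dictionary above collapsed into loc, pers, org, prod and time spans.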