2023-10-13 09:35:45,278 ----------------------------------------------------------------------------------------------------
2023-10-13 09:35:45,279 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 09:35:45,279 ----------------------------------------------------------------------------------------------------
2023-10-13 09:35:45,279 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-13 09:35:45,279 ----------------------------------------------------------------------------------------------------
2023-10-13 09:35:45,279 Train:  1214 sentences
2023-10-13 09:35:45,279 (train_with_dev=False, train_with_test=False)
2023-10-13 09:35:45,279 ----------------------------------------------------------------------------------------------------
2023-10-13 09:35:45,279 Training Params:
2023-10-13 09:35:45,280  - learning_rate: "5e-05"
2023-10-13 09:35:45,280  - mini_batch_size: "4"
2023-10-13 09:35:45,280  - max_epochs: "10"
2023-10-13 09:35:45,280  - shuffle: "True"
2023-10-13 09:35:45,280 ----------------------------------------------------------------------------------------------------
2023-10-13 09:35:45,280 Plugins:
2023-10-13 09:35:45,280  - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 09:35:45,280 ----------------------------------------------------------------------------------------------------
2023-10-13 09:35:45,280 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 09:35:45,280  - metric: "('micro avg', 'f1-score')"
2023-10-13 09:35:45,280 ----------------------------------------------------------------------------------------------------
2023-10-13 09:35:45,280 Computation:
2023-10-13 09:35:45,280  - compute on device: cuda:0
2023-10-13 09:35:45,280  - embedding storage: none
2023-10-13 09:35:45,280 ----------------------------------------------------------------------------------------------------
2023-10-13 09:35:45,280 Model training base path: "hmbench-ajmc/en-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 09:35:45,280 ----------------------------------------------------------------------------------------------------
2023-10-13 09:35:45,280 ----------------------------------------------------------------------------------------------------
2023-10-13 09:35:46,604 epoch 1 - iter 30/304 - loss 3.30844648 - time (sec): 1.32 - samples/sec: 2205.32 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:35:47,885 epoch 1 - iter 60/304 - loss 2.47744615 - time (sec): 2.60 - samples/sec: 2347.56 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:35:49,171 epoch 1 - iter 90/304 - loss 1.88649764 - time (sec): 3.89 - samples/sec: 2358.54 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:35:50,469 epoch 1 - iter 120/304 - loss 1.55352236 - time (sec): 5.19 - samples/sec: 2359.36 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:35:51,750 epoch 1 - iter 150/304 - loss 1.34488114 - time (sec): 6.47 - samples/sec: 2319.06 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:35:53,109 epoch 1 - iter 180/304 - loss 1.18097304 - time (sec): 7.83 - samples/sec: 2316.10 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:35:54,489 epoch 1 - iter 210/304 - loss 1.04955599 - time (sec): 9.21 - samples/sec: 2331.43 - lr: 0.000034 - momentum: 0.000000
2023-10-13 09:35:55,815 epoch 1 - iter 240/304 - loss 0.94981245 - time (sec): 10.53 - samples/sec: 2312.37 - lr: 0.000039 - momentum: 0.000000
2023-10-13 09:35:57,127 epoch 1 - iter 270/304 - loss 0.86651169 - time (sec): 11.85 - samples/sec: 2313.24 - lr: 0.000044 - momentum: 0.000000
2023-10-13 09:35:58,478 epoch 1 - iter 300/304 - loss 0.79597066 - time (sec): 13.20 - samples/sec: 2318.13 - lr: 0.000049 - momentum: 0.000000
2023-10-13 09:35:58,657 ----------------------------------------------------------------------------------------------------
2023-10-13 09:35:58,657 EPOCH 1 done: loss 0.7870 - lr: 0.000049
2023-10-13 09:35:59,620 DEV : loss 0.20084649324417114 - f1-score (micro avg)  0.6913
2023-10-13 09:35:59,627 saving best model
2023-10-13 09:36:00,026 ----------------------------------------------------------------------------------------------------
2023-10-13 09:36:01,361 epoch 2 - iter 30/304 - loss 0.19200632 - time (sec): 1.33 - samples/sec: 2308.54 - lr: 0.000049 - momentum: 0.000000
2023-10-13 09:36:02,731 epoch 2 - iter 60/304 - loss 0.16828937 - time (sec): 2.70 - samples/sec: 2283.67 - lr: 0.000049 - momentum: 0.000000
2023-10-13 09:36:04,072 epoch 2 - iter 90/304 - loss 0.16237859 - time (sec): 4.04 - samples/sec: 2245.17 - lr: 0.000048 - momentum: 0.000000
2023-10-13 09:36:05,422 epoch 2 - iter 120/304 - loss 0.16745475 - time (sec): 5.39 - samples/sec: 2244.03 - lr: 0.000048 - momentum: 0.000000
2023-10-13 09:36:06,761 epoch 2 - iter 150/304 - loss 0.15333330 - time (sec): 6.73 - samples/sec: 2250.71 - lr: 0.000047 - momentum: 0.000000
2023-10-13 09:36:08,061 epoch 2 - iter 180/304 - loss 0.15730871 - time (sec): 8.03 - samples/sec: 2278.36 - lr: 0.000047 - momentum: 0.000000
2023-10-13 09:36:09,437 epoch 2 - iter 210/304 - loss 0.15943170 - time (sec): 9.41 - samples/sec: 2289.73 - lr: 0.000046 - momentum: 0.000000
2023-10-13 09:36:10,820 epoch 2 - iter 240/304 - loss 0.15321193 - time (sec): 10.79 - samples/sec: 2262.83 - lr: 0.000046 - momentum: 0.000000
2023-10-13 09:36:12,248 epoch 2 - iter 270/304 - loss 0.14421916 - time (sec): 12.22 - samples/sec: 2260.01 - lr: 0.000045 - momentum: 0.000000
2023-10-13 09:36:13,581 epoch 2 - iter 300/304 - loss 0.14352072 - time (sec): 13.55 - samples/sec: 2266.80 - lr: 0.000045 - momentum: 0.000000
2023-10-13 09:36:13,748 ----------------------------------------------------------------------------------------------------
2023-10-13 09:36:13,748 EPOCH 2 done: loss 0.1435 - lr: 0.000045
2023-10-13 09:36:14,728 DEV : loss 0.1463729441165924 - f1-score (micro avg)  0.7753
2023-10-13 09:36:14,736 saving best model
2023-10-13 09:36:15,205 ----------------------------------------------------------------------------------------------------
2023-10-13 09:36:16,833 epoch 3 - iter 30/304 - loss 0.07246537 - time (sec): 1.63 - samples/sec: 1964.31 - lr: 0.000044 - momentum: 0.000000
2023-10-13 09:36:18,259 epoch 3 - iter 60/304 - loss 0.07664977 - time (sec): 3.05 - samples/sec: 2081.16 - lr: 0.000043 - momentum: 0.000000
2023-10-13 09:36:19,682 epoch 3 - iter 90/304 - loss 0.07996044 - time (sec): 4.47 - samples/sec: 2061.60 - lr: 0.000043 - momentum: 0.000000
2023-10-13 09:36:21,104 epoch 3 - iter 120/304 - loss 0.08063280 - time (sec): 5.90 - samples/sec: 2056.47 - lr: 0.000042 - momentum: 0.000000
2023-10-13 09:36:22,497 epoch 3 - iter 150/304 - loss 0.09827608 - time (sec): 7.29 - samples/sec: 2071.63 - lr: 0.000042 - momentum: 0.000000
2023-10-13 09:36:23,814 epoch 3 - iter 180/304 - loss 0.09656292 - time (sec): 8.61 - samples/sec: 2123.89 - lr: 0.000041 - momentum: 0.000000
2023-10-13 09:36:25,187 epoch 3 - iter 210/304 - loss 0.09694249 - time (sec): 9.98 - samples/sec: 2163.69 - lr: 0.000041 - momentum: 0.000000
2023-10-13 09:36:26,522 epoch 3 - iter 240/304 - loss 0.09064246 - time (sec): 11.31 - samples/sec: 2154.64 - lr: 0.000040 - momentum: 0.000000
2023-10-13 09:36:27,833 epoch 3 - iter 270/304 - loss 0.09634059 - time (sec): 12.63 - samples/sec: 2186.50 - lr: 0.000040 - momentum: 0.000000
2023-10-13 09:36:29,203 epoch 3 - iter 300/304 - loss 0.09475973 - time (sec): 14.00 - samples/sec: 2194.08 - lr: 0.000039 - momentum: 0.000000
2023-10-13 09:36:29,371 ----------------------------------------------------------------------------------------------------
2023-10-13 09:36:29,372 EPOCH 3 done: loss 0.0943 - lr: 0.000039
2023-10-13 09:36:30,334 DEV : loss 0.15982411801815033 - f1-score (micro avg)  0.8195
2023-10-13 09:36:30,341 saving best model
2023-10-13 09:36:30,906 ----------------------------------------------------------------------------------------------------
2023-10-13 09:36:32,238 epoch 4 - iter 30/304 - loss 0.07763557 - time (sec): 1.33 - samples/sec: 2492.42 - lr: 0.000038 - momentum: 0.000000
2023-10-13 09:36:33,588 epoch 4 - iter 60/304 - loss 0.08264473 - time (sec): 2.68 - samples/sec: 2302.87 - lr: 0.000038 - momentum: 0.000000
2023-10-13 09:36:34,928 epoch 4 - iter 90/304 - loss 0.08683235 - time (sec): 4.02 - samples/sec: 2327.07 - lr: 0.000037 - momentum: 0.000000
2023-10-13 09:36:36,284 epoch 4 - iter 120/304 - loss 0.08390417 - time (sec): 5.38 - samples/sec: 2298.71 - lr: 0.000037 - momentum: 0.000000
2023-10-13 09:36:37,629 epoch 4 - iter 150/304 - loss 0.07642428 - time (sec): 6.72 - samples/sec: 2281.56 - lr: 0.000036 - momentum: 0.000000
2023-10-13 09:36:38,956 epoch 4 - iter 180/304 - loss 0.06994247 - time (sec): 8.05 - samples/sec: 2312.31 - lr: 0.000036 - momentum: 0.000000
2023-10-13 09:36:40,298 epoch 4 - iter 210/304 - loss 0.06924919 - time (sec): 9.39 - samples/sec: 2287.20 - lr: 0.000035 - momentum: 0.000000
2023-10-13 09:36:41,615 epoch 4 - iter 240/304 - loss 0.06698426 - time (sec): 10.71 - samples/sec: 2291.01 - lr: 0.000035 - momentum: 0.000000
2023-10-13 09:36:42,924 epoch 4 - iter 270/304 - loss 0.06388884 - time (sec): 12.02 - samples/sec: 2294.02 - lr: 0.000034 - momentum: 0.000000
2023-10-13 09:36:44,266 epoch 4 - iter 300/304 - loss 0.06169522 - time (sec): 13.36 - samples/sec: 2290.72 - lr: 0.000033 - momentum: 0.000000
2023-10-13 09:36:44,450 ----------------------------------------------------------------------------------------------------
2023-10-13 09:36:44,450 EPOCH 4 done: loss 0.0610 - lr: 0.000033
2023-10-13 09:36:45,384 DEV : loss 0.18423306941986084 - f1-score (micro avg)  0.814
2023-10-13 09:36:45,392 ----------------------------------------------------------------------------------------------------
2023-10-13 09:36:46,755 epoch 5 - iter 30/304 - loss 0.05028610 - time (sec): 1.36 - samples/sec: 2229.81 - lr: 0.000033 - momentum: 0.000000
2023-10-13 09:36:48,122 epoch 5 - iter 60/304 - loss 0.04947102 - time (sec): 2.73 - samples/sec: 2256.86 - lr: 0.000032 - momentum: 0.000000
2023-10-13 09:36:49,457 epoch 5 - iter 90/304 - loss 0.05050518 - time (sec): 4.06 - samples/sec: 2266.03 - lr: 0.000032 - momentum: 0.000000
2023-10-13 09:36:50,777 epoch 5 - iter 120/304 - loss 0.04350965 - time (sec): 5.38 - samples/sec: 2257.66 - lr: 0.000031 - momentum: 0.000000
2023-10-13 09:36:52,089 epoch 5 - iter 150/304 - loss 0.03679437 - time (sec): 6.70 - samples/sec: 2262.74 - lr: 0.000031 - momentum: 0.000000
2023-10-13 09:36:53,404 epoch 5 - iter 180/304 - loss 0.04199881 - time (sec): 8.01 - samples/sec: 2281.64 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:36:54,733 epoch 5 - iter 210/304 - loss 0.04700527 - time (sec): 9.34 - samples/sec: 2279.09 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:36:56,064 epoch 5 - iter 240/304 - loss 0.04782724 - time (sec): 10.67 - samples/sec: 2279.95 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:36:57,402 epoch 5 - iter 270/304 - loss 0.04987184 - time (sec): 12.01 - samples/sec: 2297.27 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:36:58,720 epoch 5 - iter 300/304 - loss 0.04794231 - time (sec): 13.33 - samples/sec: 2299.61 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:36:58,899 ----------------------------------------------------------------------------------------------------
2023-10-13 09:36:58,900 EPOCH 5 done: loss 0.0474 - lr: 0.000028
2023-10-13 09:36:59,814 DEV : loss 0.2034173458814621 - f1-score (micro avg)  0.8079
2023-10-13 09:36:59,822 ----------------------------------------------------------------------------------------------------
2023-10-13 09:37:01,162 epoch 6 - iter 30/304 - loss 0.03370652 - time (sec): 1.34 - samples/sec: 2199.02 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:37:02,493 epoch 6 - iter 60/304 - loss 0.03038258 - time (sec): 2.67 - samples/sec: 2241.32 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:37:03,822 epoch 6 - iter 90/304 - loss 0.02579266 - time (sec): 4.00 - samples/sec: 2230.12 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:37:05,164 epoch 6 - iter 120/304 - loss 0.02501523 - time (sec): 5.34 - samples/sec: 2242.69 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:37:06,507 epoch 6 - iter 150/304 - loss 0.02661085 - time (sec): 6.68 - samples/sec: 2268.06 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:37:07,848 epoch 6 - iter 180/304 - loss 0.02675219 - time (sec): 8.02 - samples/sec: 2268.11 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:37:09,180 epoch 6 - iter 210/304 - loss 0.02858697 - time (sec): 9.36 - samples/sec: 2294.14 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:37:10,525 epoch 6 - iter 240/304 - loss 0.02884368 - time (sec): 10.70 - samples/sec: 2283.41 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:37:11,849 epoch 6 - iter 270/304 - loss 0.03106011 - time (sec): 12.03 - samples/sec: 2290.17 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:37:13,185 epoch 6 - iter 300/304 - loss 0.03205796 - time (sec): 13.36 - samples/sec: 2300.29 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:37:13,362 ----------------------------------------------------------------------------------------------------
2023-10-13 09:37:13,362 EPOCH 6 done: loss 0.0318 - lr: 0.000022
2023-10-13 09:37:14,394 DEV : loss 0.19591137766838074 - f1-score (micro avg)  0.8301
2023-10-13 09:37:14,404 saving best model
2023-10-13 09:37:14,928 ----------------------------------------------------------------------------------------------------
2023-10-13 09:37:16,549 epoch 7 - iter 30/304 - loss 0.02164034 - time (sec): 1.62 - samples/sec: 1859.81 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:37:18,193 epoch 7 - iter 60/304 - loss 0.01343970 - time (sec): 3.26 - samples/sec: 1860.57 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:37:19,832 epoch 7 - iter 90/304 - loss 0.02087442 - time (sec): 4.90 - samples/sec: 1833.43 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:37:21,368 epoch 7 - iter 120/304 - loss 0.02273139 - time (sec): 6.44 - samples/sec: 1863.07 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:37:22,701 epoch 7 - iter 150/304 - loss 0.02022901 - time (sec): 7.77 - samples/sec: 1955.19 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:37:24,048 epoch 7 - iter 180/304 - loss 0.02106458 - time (sec): 9.12 - samples/sec: 2006.75 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:37:25,402 epoch 7 - iter 210/304 - loss 0.02054990 - time (sec): 10.47 - samples/sec: 2059.39 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:37:26,744 epoch 7 - iter 240/304 - loss 0.02780429 - time (sec): 11.81 - samples/sec: 2083.22 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:37:28,106 epoch 7 - iter 270/304 - loss 0.02614677 - time (sec): 13.18 - samples/sec: 2097.30 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:37:29,437 epoch 7 - iter 300/304 - loss 0.02636916 - time (sec): 14.51 - samples/sec: 2108.46 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:37:29,611 ----------------------------------------------------------------------------------------------------
2023-10-13 09:37:29,611 EPOCH 7 done: loss 0.0277 - lr: 0.000017
2023-10-13 09:37:30,599 DEV : loss 0.19891534745693207 - f1-score (micro avg)  0.845
2023-10-13 09:37:30,607 saving best model
2023-10-13 09:37:31,154 ----------------------------------------------------------------------------------------------------
2023-10-13 09:37:32,478 epoch 8 - iter 30/304 - loss 0.01006643 - time (sec): 1.32 - samples/sec: 2287.55 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:37:33,832 epoch 8 - iter 60/304 - loss 0.01023392 - time (sec): 2.68 - samples/sec: 2329.60 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:37:35,176 epoch 8 - iter 90/304 - loss 0.02035441 - time (sec): 4.02 - samples/sec: 2293.03 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:37:36,531 epoch 8 - iter 120/304 - loss 0.01682450 - time (sec): 5.38 - samples/sec: 2273.58 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:37:37,884 epoch 8 - iter 150/304 - loss 0.01663843 - time (sec): 6.73 - samples/sec: 2301.05 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:37:39,188 epoch 8 - iter 180/304 - loss 0.01435199 - time (sec): 8.03 - samples/sec: 2301.63 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:37:40,504 epoch 8 - iter 210/304 - loss 0.01286606 - time (sec): 9.35 - samples/sec: 2282.79 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:37:41,819 epoch 8 - iter 240/304 - loss 0.01260687 - time (sec): 10.66 - samples/sec: 2280.08 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:37:43,170 epoch 8 - iter 270/304 - loss 0.01466054 - time (sec): 12.01 - samples/sec: 2278.13 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:37:44,543 epoch 8 - iter 300/304 - loss 0.01566152 - time (sec): 13.39 - samples/sec: 2285.03 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:37:44,711 ----------------------------------------------------------------------------------------------------
2023-10-13 09:37:44,711 EPOCH 8 done: loss 0.0160 - lr: 0.000011
2023-10-13 09:37:45,711 DEV : loss 0.2116977423429489 - f1-score (micro avg)  0.8386
2023-10-13 09:37:45,719 ----------------------------------------------------------------------------------------------------
2023-10-13 09:37:47,024 epoch 9 - iter 30/304 - loss 0.00668858 - time (sec): 1.30 - samples/sec: 2137.08 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:37:48,347 epoch 9 - iter 60/304 - loss 0.00705918 - time (sec): 2.63 - samples/sec: 2223.29 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:37:49,685 epoch 9 - iter 90/304 - loss 0.01013676 - time (sec): 3.97 - samples/sec: 2315.73 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:37:50,996 epoch 9 - iter 120/304 - loss 0.01274867 - time (sec): 5.28 - samples/sec: 2247.60 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:37:52,339 epoch 9 - iter 150/304 - loss 0.01321220 - time (sec): 6.62 - samples/sec: 2235.12 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:37:53,658 epoch 9 - iter 180/304 - loss 0.01345744 - time (sec): 7.94 - samples/sec: 2280.51 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:37:54,988 epoch 9 - iter 210/304 - loss 0.01133553 - time (sec): 9.27 - samples/sec: 2324.67 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:37:56,316 epoch 9 - iter 240/304 - loss 0.01009754 - time (sec): 10.60 - samples/sec: 2306.14 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:37:57,631 epoch 9 - iter 270/304 - loss 0.00997680 - time (sec): 11.91 - samples/sec: 2310.85 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:37:58,955 epoch 9 - iter 300/304 - loss 0.01010998 - time (sec): 13.23 - samples/sec: 2314.97 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:37:59,123 ----------------------------------------------------------------------------------------------------
2023-10-13 09:37:59,123 EPOCH 9 done: loss 0.0101 - lr: 0.000006
2023-10-13 09:38:00,316 DEV : loss 0.21634384989738464 - f1-score (micro avg)  0.8427
2023-10-13 09:38:00,326 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:01,931 epoch 10 - iter 30/304 - loss 0.00792384 - time (sec): 1.60 - samples/sec: 1888.97 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:38:03,341 epoch 10 - iter 60/304 - loss 0.00514030 - time (sec): 3.01 - samples/sec: 1959.23 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:38:04,711 epoch 10 - iter 90/304 - loss 0.00487148 - time (sec): 4.38 - samples/sec: 1993.86 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:38:06,075 epoch 10 - iter 120/304 - loss 0.00415204 - time (sec): 5.75 - samples/sec: 2065.00 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:38:07,525 epoch 10 - iter 150/304 - loss 0.00576126 - time (sec): 7.20 - samples/sec: 2089.46 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:38:09,000 epoch 10 - iter 180/304 - loss 0.00474623 - time (sec): 8.67 - samples/sec: 2115.28 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:38:10,708 epoch 10 - iter 210/304 - loss 0.00579619 - time (sec): 10.38 - samples/sec: 2057.31 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:38:12,285 epoch 10 - iter 240/304 - loss 0.00859001 - time (sec): 11.96 - samples/sec: 2037.86 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:38:13,831 epoch 10 - iter 270/304 - loss 0.00907432 - time (sec): 13.50 - samples/sec: 2019.42 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:38:15,394 epoch 10 - iter 300/304 - loss 0.00870551 - time (sec): 15.07 - samples/sec: 2033.61 - lr: 0.000000 - momentum: 0.000000
2023-10-13 09:38:15,592 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:15,593 EPOCH 10 done: loss 0.0086 - lr: 0.000000
2023-10-13 09:38:16,513 DEV : loss 0.21387432515621185 - f1-score (micro avg)  0.8528
2023-10-13 09:38:16,523 saving best model
2023-10-13 09:38:17,416 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:17,417 Loading model from best epoch ...
2023-10-13 09:38:19,210 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-13 09:38:20,032 Results:
- F-score (micro) 0.7973
- F-score (macro) 0.5914
- Accuracy 0.672

By class:
              precision    recall  f1-score   support

       scope     0.7108    0.7815    0.7445       151
        work     0.7589    0.8947    0.8213        95
        pers     0.8491    0.9375    0.8911        96
         loc     0.4000    0.6667    0.5000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7526    0.8477    0.7973       348
   macro avg     0.5438    0.6561    0.5914       348
weighted avg     0.7533    0.8477    0.7974       348

2023-10-13 09:38:20,032 ----------------------------------------------------------------------------------------------------
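The per-epoch `EPOCH n done` / `DEV :` pairs above can be scanned programmatically to recover which epoch produced the best dev f1 (epoch 10, 0.8528 in this run). A minimal sketch, assuming the Flair log format shown above; `best_dev_epoch` and the trimmed `SAMPLE` excerpt are illustrative, not part of Flair:

```python
import re

def best_dev_epoch(log_text: str):
    """Return (epoch, dev_micro_f1) with the highest dev f1 in a Flair training log.

    Relies on each "EPOCH n done" line being followed by exactly one "DEV :" line,
    as in the log above, so the two findall results align index-by-index.
    """
    epochs = re.findall(r"EPOCH (\d+) done", log_text)
    f1s = re.findall(r"DEV : loss [\d.]+ - f1-score \(micro avg\)\s+([\d.]+)", log_text)
    scored = list(zip(map(int, epochs), map(float, f1s)))
    return max(scored, key=lambda pair: pair[1])

# Two epochs excerpted verbatim from the log above.
SAMPLE = """\
2023-10-13 09:37:59,123 EPOCH 9 done: loss 0.0101 - lr: 0.000006
2023-10-13 09:38:00,316 DEV : loss 0.21634384989738464 - f1-score (micro avg)  0.8427
2023-10-13 09:38:15,593 EPOCH 10 done: loss 0.0086 - lr: 0.000000
2023-10-13 09:38:16,513 DEV : loss 0.21387432515621185 - f1-score (micro avg)  0.8528
"""

print(best_dev_epoch(SAMPLE))  # -> (10, 0.8528)
```

Note that the "saving best model" lines track the same signal: Flair checkpoints `best-model.pt` whenever the dev micro f1 improves, which is why the final evaluation (micro f1 0.7973 on test) is reported for the epoch-10 model.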