2023-10-16 23:28:46,167 ----------------------------------------------------------------------------------------------------
2023-10-16 23:28:46,168 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 23:28:46,168 ----------------------------------------------------------------------------------------------------
2023-10-16 23:28:46,168 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
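As a sanity check on the architecture dump above, the printed layer shapes can be turned into a back-of-the-envelope parameter count (an illustrative sketch derived only from the dimensions in the log, not Flair output; Dropout and the activations hold no parameters):

```python
# Rough parameter count for the model printed above:
# hidden=768, intermediate=3072, 12 layers, vocab=32001, 13 output tags.
HIDDEN, INTERMEDIATE, LAYERS, VOCAB, MAX_POS, TYPES, TAGS = 768, 3072, 12, 32001, 512, 2, 13

def linear(n_in, n_out):
    """Weights plus bias of a Linear(n_in, n_out)."""
    return n_in * n_out + n_out

# word + position + token-type embeddings, plus the embedding LayerNorm (gain + bias)
embeddings = VOCAB * HIDDEN + MAX_POS * HIDDEN + TYPES * HIDDEN + 2 * HIDDEN
per_layer = (
    4 * linear(HIDDEN, HIDDEN)        # query, key, value, attention output
    + linear(HIDDEN, INTERMEDIATE)    # intermediate dense
    + linear(INTERMEDIATE, HIDDEN)    # output dense
    + 2 * 2 * HIDDEN                  # two LayerNorms per layer
)
pooler = linear(HIDDEN, HIDDEN)
head = linear(HIDDEN, TAGS)           # the (linear) tagging head over 13 tags

total = embeddings + LAYERS * per_layer + pooler + head
print(f"~{total / 1e6:.1f}M parameters")  # roughly 110M, typical for a BERT-base encoder
```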
2023-10-16 23:28:46,168 ----------------------------------------------------------------------------------------------------
2023-10-16 23:28:46,168 Train: 6183 sentences
2023-10-16 23:28:46,168 (train_with_dev=False, train_with_test=False)
2023-10-16 23:28:46,168 ----------------------------------------------------------------------------------------------------
2023-10-16 23:28:46,168 Training Params:
2023-10-16 23:28:46,168 - learning_rate: "5e-05"
2023-10-16 23:28:46,168 - mini_batch_size: "8"
2023-10-16 23:28:46,168 - max_epochs: "10"
2023-10-16 23:28:46,168 - shuffle: "True"
2023-10-16 23:28:46,168 ----------------------------------------------------------------------------------------------------
2023-10-16 23:28:46,168 Plugins:
2023-10-16 23:28:46,168 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 23:28:46,168 ----------------------------------------------------------------------------------------------------
2023-10-16 23:28:46,168 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 23:28:46,168 - metric: "('micro avg', 'f1-score')"
2023-10-16 23:28:46,168 ----------------------------------------------------------------------------------------------------
2023-10-16 23:28:46,168 Computation:
2023-10-16 23:28:46,169 - compute on device: cuda:0
2023-10-16 23:28:46,169 - embedding storage: none
2023-10-16 23:28:46,169 ----------------------------------------------------------------------------------------------------
2023-10-16 23:28:46,169 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-16 23:28:46,169 ----------------------------------------------------------------------------------------------------
2023-10-16 23:28:46,169 ----------------------------------------------------------------------------------------------------
2023-10-16 23:28:50,364 epoch 1 - iter 77/773 - loss 2.05821625 - time (sec): 4.19 - samples/sec: 2847.57 - lr: 0.000005 - momentum: 0.000000
2023-10-16 23:28:54,564 epoch 1 - iter 154/773 - loss 1.12057768 - time (sec): 8.39 - samples/sec: 2887.89 - lr: 0.000010 - momentum: 0.000000
2023-10-16 23:28:58,994 epoch 1 - iter 231/773 - loss 0.79720912 - time (sec): 12.82 - samples/sec: 2842.82 - lr: 0.000015 - momentum: 0.000000
2023-10-16 23:29:03,261 epoch 1 - iter 308/773 - loss 0.63122125 - time (sec): 17.09 - samples/sec: 2838.16 - lr: 0.000020 - momentum: 0.000000
2023-10-16 23:29:07,541 epoch 1 - iter 385/773 - loss 0.52681358 - time (sec): 21.37 - samples/sec: 2836.09 - lr: 0.000025 - momentum: 0.000000
2023-10-16 23:29:12,095 epoch 1 - iter 462/773 - loss 0.45704218 - time (sec): 25.93 - samples/sec: 2827.40 - lr: 0.000030 - momentum: 0.000000
2023-10-16 23:29:16,506 epoch 1 - iter 539/773 - loss 0.40967278 - time (sec): 30.34 - samples/sec: 2815.64 - lr: 0.000035 - momentum: 0.000000
2023-10-16 23:29:21,231 epoch 1 - iter 616/773 - loss 0.36474823 - time (sec): 35.06 - samples/sec: 2836.79 - lr: 0.000040 - momentum: 0.000000
2023-10-16 23:29:25,556 epoch 1 - iter 693/773 - loss 0.33633044 - time (sec): 39.39 - samples/sec: 2827.61 - lr: 0.000045 - momentum: 0.000000
2023-10-16 23:29:30,163 epoch 1 - iter 770/773 - loss 0.31383826 - time (sec): 43.99 - samples/sec: 2815.87 - lr: 0.000050 - momentum: 0.000000
2023-10-16 23:29:30,314 ----------------------------------------------------------------------------------------------------
2023-10-16 23:29:30,314 EPOCH 1 done: loss 0.3128 - lr: 0.000050
2023-10-16 23:29:32,330 DEV : loss 0.07684116810560226 - f1-score (micro avg) 0.6582
2023-10-16 23:29:32,344 saving best model
2023-10-16 23:29:32,700 ----------------------------------------------------------------------------------------------------
2023-10-16 23:29:37,341 epoch 2 - iter 77/773 - loss 0.08541874 - time (sec): 4.64 - samples/sec: 2806.77 - lr: 0.000049 - momentum: 0.000000
2023-10-16 23:29:42,127 epoch 2 - iter 154/773 - loss 0.09150656 - time (sec): 9.43 - samples/sec: 2780.41 - lr: 0.000049 - momentum: 0.000000
2023-10-16 23:29:46,812 epoch 2 - iter 231/773 - loss 0.08021713 - time (sec): 14.11 - samples/sec: 2809.83 - lr: 0.000048 - momentum: 0.000000
2023-10-16 23:29:51,335 epoch 2 - iter 308/773 - loss 0.07652856 - time (sec): 18.63 - samples/sec: 2821.09 - lr: 0.000048 - momentum: 0.000000
2023-10-16 23:29:55,825 epoch 2 - iter 385/773 - loss 0.07556188 - time (sec): 23.12 - samples/sec: 2797.92 - lr: 0.000047 - momentum: 0.000000
2023-10-16 23:30:00,092 epoch 2 - iter 462/773 - loss 0.07533380 - time (sec): 27.39 - samples/sec: 2780.83 - lr: 0.000047 - momentum: 0.000000
2023-10-16 23:30:04,556 epoch 2 - iter 539/773 - loss 0.07534503 - time (sec): 31.85 - samples/sec: 2748.18 - lr: 0.000046 - momentum: 0.000000
2023-10-16 23:30:09,069 epoch 2 - iter 616/773 - loss 0.07517209 - time (sec): 36.37 - samples/sec: 2748.61 - lr: 0.000046 - momentum: 0.000000
2023-10-16 23:30:13,393 epoch 2 - iter 693/773 - loss 0.07516803 - time (sec): 40.69 - samples/sec: 2751.22 - lr: 0.000045 - momentum: 0.000000
2023-10-16 23:30:17,600 epoch 2 - iter 770/773 - loss 0.07576881 - time (sec): 44.90 - samples/sec: 2760.21 - lr: 0.000044 - momentum: 0.000000
2023-10-16 23:30:17,748 ----------------------------------------------------------------------------------------------------
2023-10-16 23:30:17,748 EPOCH 2 done: loss 0.0758 - lr: 0.000044
2023-10-16 23:30:19,803 DEV : loss 0.055554550141096115 - f1-score (micro avg) 0.764
2023-10-16 23:30:19,818 saving best model
2023-10-16 23:30:20,625 ----------------------------------------------------------------------------------------------------
2023-10-16 23:30:25,385 epoch 3 - iter 77/773 - loss 0.04908080 - time (sec): 4.76 - samples/sec: 2775.34 - lr: 0.000044 - momentum: 0.000000
2023-10-16 23:30:30,059 epoch 3 - iter 154/773 - loss 0.04913465 - time (sec): 9.43 - samples/sec: 2795.74 - lr: 0.000043 - momentum: 0.000000
2023-10-16 23:30:34,434 epoch 3 - iter 231/773 - loss 0.04947207 - time (sec): 13.80 - samples/sec: 2780.30 - lr: 0.000043 - momentum: 0.000000
2023-10-16 23:30:38,897 epoch 3 - iter 308/773 - loss 0.04908871 - time (sec): 18.27 - samples/sec: 2821.00 - lr: 0.000042 - momentum: 0.000000
2023-10-16 23:30:43,515 epoch 3 - iter 385/773 - loss 0.05031143 - time (sec): 22.89 - samples/sec: 2790.79 - lr: 0.000042 - momentum: 0.000000
2023-10-16 23:30:48,025 epoch 3 - iter 462/773 - loss 0.04978288 - time (sec): 27.40 - samples/sec: 2763.41 - lr: 0.000041 - momentum: 0.000000
2023-10-16 23:30:52,500 epoch 3 - iter 539/773 - loss 0.04951112 - time (sec): 31.87 - samples/sec: 2760.48 - lr: 0.000041 - momentum: 0.000000
2023-10-16 23:30:56,878 epoch 3 - iter 616/773 - loss 0.04912279 - time (sec): 36.25 - samples/sec: 2749.09 - lr: 0.000040 - momentum: 0.000000
2023-10-16 23:31:01,417 epoch 3 - iter 693/773 - loss 0.05080718 - time (sec): 40.79 - samples/sec: 2744.16 - lr: 0.000039 - momentum: 0.000000
2023-10-16 23:31:05,798 epoch 3 - iter 770/773 - loss 0.05022563 - time (sec): 45.17 - samples/sec: 2742.14 - lr: 0.000039 - momentum: 0.000000
2023-10-16 23:31:05,956 ----------------------------------------------------------------------------------------------------
2023-10-16 23:31:05,956 EPOCH 3 done: loss 0.0501 - lr: 0.000039
2023-10-16 23:31:08,014 DEV : loss 0.07521194219589233 - f1-score (micro avg) 0.7692
2023-10-16 23:31:08,027 saving best model
2023-10-16 23:31:08,505 ----------------------------------------------------------------------------------------------------
2023-10-16 23:31:13,235 epoch 4 - iter 77/773 - loss 0.02831762 - time (sec): 4.73 - samples/sec: 2810.49 - lr: 0.000038 - momentum: 0.000000
2023-10-16 23:31:17,778 epoch 4 - iter 154/773 - loss 0.02666079 - time (sec): 9.27 - samples/sec: 2723.60 - lr: 0.000038 - momentum: 0.000000
2023-10-16 23:31:22,399 epoch 4 - iter 231/773 - loss 0.02794596 - time (sec): 13.89 - samples/sec: 2758.20 - lr: 0.000037 - momentum: 0.000000
2023-10-16 23:31:26,765 epoch 4 - iter 308/773 - loss 0.02787832 - time (sec): 18.26 - samples/sec: 2750.75 - lr: 0.000037 - momentum: 0.000000
2023-10-16 23:31:31,084 epoch 4 - iter 385/773 - loss 0.02779843 - time (sec): 22.58 - samples/sec: 2750.45 - lr: 0.000036 - momentum: 0.000000
2023-10-16 23:31:35,850 epoch 4 - iter 462/773 - loss 0.02839736 - time (sec): 27.34 - samples/sec: 2756.71 - lr: 0.000036 - momentum: 0.000000
2023-10-16 23:31:40,234 epoch 4 - iter 539/773 - loss 0.02850031 - time (sec): 31.73 - samples/sec: 2757.04 - lr: 0.000035 - momentum: 0.000000
2023-10-16 23:31:44,818 epoch 4 - iter 616/773 - loss 0.02982575 - time (sec): 36.31 - samples/sec: 2760.38 - lr: 0.000034 - momentum: 0.000000
2023-10-16 23:31:49,130 epoch 4 - iter 693/773 - loss 0.03016989 - time (sec): 40.62 - samples/sec: 2764.26 - lr: 0.000034 - momentum: 0.000000
2023-10-16 23:31:53,448 epoch 4 - iter 770/773 - loss 0.03119377 - time (sec): 44.94 - samples/sec: 2758.67 - lr: 0.000033 - momentum: 0.000000
2023-10-16 23:31:53,599 ----------------------------------------------------------------------------------------------------
2023-10-16 23:31:53,599 EPOCH 4 done: loss 0.0314 - lr: 0.000033
2023-10-16 23:31:55,691 DEV : loss 0.07656633853912354 - f1-score (micro avg) 0.7416
2023-10-16 23:31:55,705 ----------------------------------------------------------------------------------------------------
2023-10-16 23:32:00,011 epoch 5 - iter 77/773 - loss 0.01674268 - time (sec): 4.30 - samples/sec: 2751.58 - lr: 0.000033 - momentum: 0.000000
2023-10-16 23:32:04,468 epoch 5 - iter 154/773 - loss 0.02139996 - time (sec): 8.76 - samples/sec: 2772.64 - lr: 0.000032 - momentum: 0.000000
2023-10-16 23:32:09,064 epoch 5 - iter 231/773 - loss 0.02191202 - time (sec): 13.36 - samples/sec: 2745.80 - lr: 0.000032 - momentum: 0.000000
2023-10-16 23:32:13,609 epoch 5 - iter 308/773 - loss 0.02319700 - time (sec): 17.90 - samples/sec: 2760.51 - lr: 0.000031 - momentum: 0.000000
2023-10-16 23:32:18,071 epoch 5 - iter 385/773 - loss 0.02257452 - time (sec): 22.37 - samples/sec: 2753.35 - lr: 0.000031 - momentum: 0.000000
2023-10-16 23:32:22,665 epoch 5 - iter 462/773 - loss 0.02279902 - time (sec): 26.96 - samples/sec: 2754.67 - lr: 0.000030 - momentum: 0.000000
2023-10-16 23:32:27,378 epoch 5 - iter 539/773 - loss 0.02286138 - time (sec): 31.67 - samples/sec: 2741.01 - lr: 0.000029 - momentum: 0.000000
2023-10-16 23:32:31,807 epoch 5 - iter 616/773 - loss 0.02209983 - time (sec): 36.10 - samples/sec: 2738.25 - lr: 0.000029 - momentum: 0.000000
2023-10-16 23:32:36,403 epoch 5 - iter 693/773 - loss 0.02219041 - time (sec): 40.70 - samples/sec: 2729.94 - lr: 0.000028 - momentum: 0.000000
2023-10-16 23:32:40,881 epoch 5 - iter 770/773 - loss 0.02191350 - time (sec): 45.17 - samples/sec: 2745.07 - lr: 0.000028 - momentum: 0.000000
2023-10-16 23:32:41,031 ----------------------------------------------------------------------------------------------------
2023-10-16 23:32:41,031 EPOCH 5 done: loss 0.0219 - lr: 0.000028
2023-10-16 23:32:43,078 DEV : loss 0.08744674175977707 - f1-score (micro avg) 0.7884
2023-10-16 23:32:43,092 saving best model
2023-10-16 23:32:43,572 ----------------------------------------------------------------------------------------------------
2023-10-16 23:32:48,082 epoch 6 - iter 77/773 - loss 0.01910359 - time (sec): 4.51 - samples/sec: 2654.57 - lr: 0.000027 - momentum: 0.000000
2023-10-16 23:32:52,487 epoch 6 - iter 154/773 - loss 0.01365856 - time (sec): 8.91 - samples/sec: 2733.99 - lr: 0.000027 - momentum: 0.000000
2023-10-16 23:32:57,069 epoch 6 - iter 231/773 - loss 0.01405153 - time (sec): 13.49 - samples/sec: 2743.22 - lr: 0.000026 - momentum: 0.000000
2023-10-16 23:33:01,369 epoch 6 - iter 308/773 - loss 0.01462145 - time (sec): 17.79 - samples/sec: 2797.08 - lr: 0.000026 - momentum: 0.000000
2023-10-16 23:33:05,824 epoch 6 - iter 385/773 - loss 0.01624866 - time (sec): 22.25 - samples/sec: 2770.71 - lr: 0.000025 - momentum: 0.000000
2023-10-16 23:33:10,316 epoch 6 - iter 462/773 - loss 0.01588902 - time (sec): 26.74 - samples/sec: 2764.50 - lr: 0.000024 - momentum: 0.000000
2023-10-16 23:33:15,099 epoch 6 - iter 539/773 - loss 0.01681257 - time (sec): 31.52 - samples/sec: 2777.37 - lr: 0.000024 - momentum: 0.000000
2023-10-16 23:33:19,716 epoch 6 - iter 616/773 - loss 0.01638494 - time (sec): 36.14 - samples/sec: 2769.26 - lr: 0.000023 - momentum: 0.000000
2023-10-16 23:33:24,073 epoch 6 - iter 693/773 - loss 0.01586282 - time (sec): 40.50 - samples/sec: 2763.41 - lr: 0.000023 - momentum: 0.000000
2023-10-16 23:33:28,420 epoch 6 - iter 770/773 - loss 0.01618702 - time (sec): 44.84 - samples/sec: 2759.69 - lr: 0.000022 - momentum: 0.000000
2023-10-16 23:33:28,594 ----------------------------------------------------------------------------------------------------
2023-10-16 23:33:28,594 EPOCH 6 done: loss 0.0161 - lr: 0.000022
2023-10-16 23:33:30,724 DEV : loss 0.09938821196556091 - f1-score (micro avg) 0.7923
2023-10-16 23:33:30,738 saving best model
2023-10-16 23:33:31,222 ----------------------------------------------------------------------------------------------------
2023-10-16 23:33:35,653 epoch 7 - iter 77/773 - loss 0.01423563 - time (sec): 4.43 - samples/sec: 2813.92 - lr: 0.000022 - momentum: 0.000000
2023-10-16 23:33:40,674 epoch 7 - iter 154/773 - loss 0.01095699 - time (sec): 9.45 - samples/sec: 2689.81 - lr: 0.000021 - momentum: 0.000000
2023-10-16 23:33:45,071 epoch 7 - iter 231/773 - loss 0.01206701 - time (sec): 13.85 - samples/sec: 2715.66 - lr: 0.000021 - momentum: 0.000000
2023-10-16 23:33:49,552 epoch 7 - iter 308/773 - loss 0.01215201 - time (sec): 18.33 - samples/sec: 2722.00 - lr: 0.000020 - momentum: 0.000000
2023-10-16 23:33:53,801 epoch 7 - iter 385/773 - loss 0.01162227 - time (sec): 22.58 - samples/sec: 2718.52 - lr: 0.000019 - momentum: 0.000000
2023-10-16 23:33:58,192 epoch 7 - iter 462/773 - loss 0.01129353 - time (sec): 26.97 - samples/sec: 2718.33 - lr: 0.000019 - momentum: 0.000000
2023-10-16 23:34:02,718 epoch 7 - iter 539/773 - loss 0.01203718 - time (sec): 31.49 - samples/sec: 2712.16 - lr: 0.000018 - momentum: 0.000000
2023-10-16 23:34:07,327 epoch 7 - iter 616/773 - loss 0.01117885 - time (sec): 36.10 - samples/sec: 2694.16 - lr: 0.000018 - momentum: 0.000000
2023-10-16 23:34:11,793 epoch 7 - iter 693/773 - loss 0.01110729 - time (sec): 40.57 - samples/sec: 2694.19 - lr: 0.000017 - momentum: 0.000000
2023-10-16 23:34:16,679 epoch 7 - iter 770/773 - loss 0.01111937 - time (sec): 45.46 - samples/sec: 2722.68 - lr: 0.000017 - momentum: 0.000000
2023-10-16 23:34:16,864 ----------------------------------------------------------------------------------------------------
2023-10-16 23:34:16,864 EPOCH 7 done: loss 0.0111 - lr: 0.000017
2023-10-16 23:34:18,997 DEV : loss 0.09087057411670685 - f1-score (micro avg) 0.7918
2023-10-16 23:34:19,011 ----------------------------------------------------------------------------------------------------
2023-10-16 23:34:23,489 epoch 8 - iter 77/773 - loss 0.00268183 - time (sec): 4.48 - samples/sec: 2682.46 - lr: 0.000016 - momentum: 0.000000
2023-10-16 23:34:27,889 epoch 8 - iter 154/773 - loss 0.00312117 - time (sec): 8.88 - samples/sec: 2738.50 - lr: 0.000016 - momentum: 0.000000
2023-10-16 23:34:32,381 epoch 8 - iter 231/773 - loss 0.00333409 - time (sec): 13.37 - samples/sec: 2749.16 - lr: 0.000015 - momentum: 0.000000
2023-10-16 23:34:37,033 epoch 8 - iter 308/773 - loss 0.00376627 - time (sec): 18.02 - samples/sec: 2741.37 - lr: 0.000014 - momentum: 0.000000
2023-10-16 23:34:41,429 epoch 8 - iter 385/773 - loss 0.00475014 - time (sec): 22.42 - samples/sec: 2745.17 - lr: 0.000014 - momentum: 0.000000
2023-10-16 23:34:45,998 epoch 8 - iter 462/773 - loss 0.00587725 - time (sec): 26.99 - samples/sec: 2736.68 - lr: 0.000013 - momentum: 0.000000
2023-10-16 23:34:50,586 epoch 8 - iter 539/773 - loss 0.00660246 - time (sec): 31.57 - samples/sec: 2744.37 - lr: 0.000013 - momentum: 0.000000
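The lr column in the iteration lines above traces the LinearScheduler configured in the Plugins section: with warmup_fraction 0.1 over 10 epochs of 773 steps each, the rate climbs linearly to the 5e-05 peak during epoch 1 and then decays linearly toward zero. A minimal sketch of that schedule (plain Python, assumed shape of the schedule, not Flair's implementation):

```python
def linear_warmup_lr(step, total_steps=7730, warmup_fraction=0.1, peak=5e-05):
    """Linear warmup to `peak`, then linear decay to 0, as in the log above."""
    warmup_steps = int(total_steps * warmup_fraction)  # 773 steps = exactly epoch 1
    if step <= warmup_steps:
        return peak * step / warmup_steps
    return peak * (total_steps - step) / (total_steps - warmup_steps)

# epoch 1, iter 77 -> ~0.000005; epoch 1, iter 770 -> ~0.000050 (matches the log)
print(round(linear_warmup_lr(77), 6), round(linear_warmup_lr(770), 6))
```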
2023-10-16 23:34:55,103 epoch 8 - iter 616/773 - loss 0.00647718 - time (sec): 36.09 - samples/sec: 2753.72 - lr: 0.000012 - momentum: 0.000000
2023-10-16 23:34:59,555 epoch 8 - iter 693/773 - loss 0.00616577 - time (sec): 40.54 - samples/sec: 2753.41 - lr: 0.000012 - momentum: 0.000000
2023-10-16 23:35:03,859 epoch 8 - iter 770/773 - loss 0.00609504 - time (sec): 44.85 - samples/sec: 2760.88 - lr: 0.000011 - momentum: 0.000000
2023-10-16 23:35:04,016 ----------------------------------------------------------------------------------------------------
2023-10-16 23:35:04,016 EPOCH 8 done: loss 0.0061 - lr: 0.000011
2023-10-16 23:35:06,122 DEV : loss 0.10900219529867172 - f1-score (micro avg) 0.8094
2023-10-16 23:35:06,136 saving best model
2023-10-16 23:35:06,602 ----------------------------------------------------------------------------------------------------
2023-10-16 23:35:11,205 epoch 9 - iter 77/773 - loss 0.00109865 - time (sec): 4.60 - samples/sec: 2716.52 - lr: 0.000011 - momentum: 0.000000
2023-10-16 23:35:15,498 epoch 9 - iter 154/773 - loss 0.00300943 - time (sec): 8.89 - samples/sec: 2727.91 - lr: 0.000010 - momentum: 0.000000
2023-10-16 23:35:20,024 epoch 9 - iter 231/773 - loss 0.00335115 - time (sec): 13.42 - samples/sec: 2794.60 - lr: 0.000009 - momentum: 0.000000
2023-10-16 23:35:24,367 epoch 9 - iter 308/773 - loss 0.00303445 - time (sec): 17.76 - samples/sec: 2777.65 - lr: 0.000009 - momentum: 0.000000
2023-10-16 23:35:28,759 epoch 9 - iter 385/773 - loss 0.00350210 - time (sec): 22.16 - samples/sec: 2764.14 - lr: 0.000008 - momentum: 0.000000
2023-10-16 23:35:33,459 epoch 9 - iter 462/773 - loss 0.00424632 - time (sec): 26.86 - samples/sec: 2774.78 - lr: 0.000008 - momentum: 0.000000
2023-10-16 23:35:37,906 epoch 9 - iter 539/773 - loss 0.00399565 - time (sec): 31.30 - samples/sec: 2778.55 - lr: 0.000007 - momentum: 0.000000
2023-10-16 23:35:42,541 epoch 9 - iter 616/773 - loss 0.00372286 - time (sec): 35.94 - samples/sec: 2766.19 - lr: 0.000007 - momentum: 0.000000
2023-10-16 23:35:47,163 epoch 9 - iter 693/773 - loss 0.00376644 - time (sec): 40.56 - samples/sec: 2764.44 - lr: 0.000006 - momentum: 0.000000
2023-10-16 23:35:51,583 epoch 9 - iter 770/773 - loss 0.00383131 - time (sec): 44.98 - samples/sec: 2746.64 - lr: 0.000006 - momentum: 0.000000
2023-10-16 23:35:51,801 ----------------------------------------------------------------------------------------------------
2023-10-16 23:35:51,801 EPOCH 9 done: loss 0.0038 - lr: 0.000006
2023-10-16 23:35:53,905 DEV : loss 0.10695482790470123 - f1-score (micro avg) 0.8058
2023-10-16 23:35:53,918 ----------------------------------------------------------------------------------------------------
2023-10-16 23:35:58,505 epoch 10 - iter 77/773 - loss 0.00095541 - time (sec): 4.59 - samples/sec: 2808.41 - lr: 0.000005 - momentum: 0.000000
2023-10-16 23:36:03,206 epoch 10 - iter 154/773 - loss 0.00147003 - time (sec): 9.29 - samples/sec: 2749.68 - lr: 0.000005 - momentum: 0.000000
2023-10-16 23:36:07,730 epoch 10 - iter 231/773 - loss 0.00132085 - time (sec): 13.81 - samples/sec: 2778.54 - lr: 0.000004 - momentum: 0.000000
2023-10-16 23:36:12,146 epoch 10 - iter 308/773 - loss 0.00198611 - time (sec): 18.23 - samples/sec: 2773.05 - lr: 0.000003 - momentum: 0.000000
2023-10-16 23:36:16,876 epoch 10 - iter 385/773 - loss 0.00180596 - time (sec): 22.96 - samples/sec: 2758.06 - lr: 0.000003 - momentum: 0.000000
2023-10-16 23:36:21,168 epoch 10 - iter 462/773 - loss 0.00188308 - time (sec): 27.25 - samples/sec: 2749.43 - lr: 0.000002 - momentum: 0.000000
2023-10-16 23:36:25,507 epoch 10 - iter 539/773 - loss 0.00185097 - time (sec): 31.59 - samples/sec: 2758.86 - lr: 0.000002 - momentum: 0.000000
2023-10-16 23:36:30,388 epoch 10 - iter 616/773 - loss 0.00189967 - time (sec): 36.47 - samples/sec: 2736.70 - lr: 0.000001 - momentum: 0.000000
2023-10-16 23:36:34,709 epoch 10 - iter 693/773 - loss 0.00210131 - time (sec): 40.79 - samples/sec: 2730.26 - lr: 0.000001 - momentum: 0.000000
2023-10-16 23:36:39,163 epoch 10 - iter 770/773 - loss 0.00223480 - time (sec): 45.24 - samples/sec: 2734.71 - lr: 0.000000 - momentum: 0.000000
2023-10-16 23:36:39,351 ----------------------------------------------------------------------------------------------------
2023-10-16 23:36:39,351 EPOCH 10 done: loss 0.0022 - lr: 0.000000
2023-10-16 23:36:41,341 DEV : loss 0.10907312482595444 - f1-score (micro avg) 0.8082
2023-10-16 23:36:41,688 ----------------------------------------------------------------------------------------------------
2023-10-16 23:36:41,689 Loading model from best epoch ...
2023-10-16 23:36:43,334 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-16 23:36:49,459 Results:
- F-score (micro) 0.8046
- F-score (macro) 0.7284
- Accuracy 0.6925

By class:
              precision    recall  f1-score   support

         LOC     0.8711    0.8288    0.8494       946
    BUILDING     0.5959    0.6216    0.6085       185
      STREET     0.7407    0.7143    0.7273        56

   micro avg     0.8187    0.7911    0.8046      1187
   macro avg     0.7359    0.7216    0.7284      1187
weighted avg     0.8221    0.7911    0.8061      1187

2023-10-16 23:36:49,460 ----------------------------------------------------------------------------------------------------
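The aggregate scores in the final results are consistent with the per-class rows. A quick cross-check that recomputes the micro and macro F1 from the per-class precision/recall/support alone (plain Python; true-positive and prediction counts are recovered by rounding, which is exact here because supports are small):

```python
# (precision, recall, support) per class, copied from the results table above.
classes = {
    "LOC":      (0.8711, 0.8288, 946),
    "BUILDING": (0.5959, 0.6216, 185),
    "STREET":   (0.7407, 0.7143, 56),
}

tp = pred = gold = 0
f1s = []
for p, r, support in classes.values():
    c_tp = round(r * support)   # true positives for this class
    tp += c_tp
    pred += round(c_tp / p)     # predicted spans for this class
    gold += support
    f1s.append(2 * p * r / (p + r))

micro_p, micro_r = tp / pred, tp / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
macro_f1 = sum(f1s) / len(f1s)
print(round(micro_f1, 4), round(macro_f1, 4))  # reproduces the logged 0.8046 / 0.7284
```

Note how the micro average is pulled toward the majority LOC class (support 946 of 1187), while the macro average weights the weaker BUILDING class equally, which is why it sits noticeably lower.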