2023-10-13 09:51:37,154 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:37,155 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 09:51:37,156 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:37,156 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-13 09:51:37,156 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:37,156 Train: 1214 sentences
2023-10-13 09:51:37,156 (train_with_dev=False, train_with_test=False)
2023-10-13 09:51:37,156 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:37,156 Training Params:
2023-10-13 09:51:37,156 - learning_rate: "5e-05"
2023-10-13 09:51:37,156 - mini_batch_size: "8"
2023-10-13 09:51:37,156 - max_epochs: "10"
2023-10-13 09:51:37,156 - shuffle: "True"
2023-10-13 09:51:37,156 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:37,156 Plugins:
2023-10-13 09:51:37,156 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 09:51:37,156 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:37,156 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 09:51:37,156 - metric: "('micro avg', 'f1-score')"
2023-10-13 09:51:37,156 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:37,156 Computation:
2023-10-13 09:51:37,156 - compute on device: cuda:0
2023-10-13 09:51:37,156 - embedding storage: none
2023-10-13 09:51:37,156 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:37,156 Model training base path: "hmbench-ajmc/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-13 09:51:37,156 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:37,156 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:38,052 epoch 1 - iter 15/152 - loss 3.18023791 - time (sec): 0.89 - samples/sec: 3154.22 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:51:39,016 epoch 1 - iter 30/152 - loss 2.78060410 - time (sec): 1.86 - samples/sec: 3367.93 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:51:39,941 epoch 1 - iter 45/152 - loss 2.13363903 - time (sec): 2.78 - samples/sec: 3390.41 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:51:40,807 epoch 1 - iter 60/152 - loss 1.81902299 - time (sec): 3.65 - samples/sec: 3425.26 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:51:41,632 epoch 1 - iter 75/152 - loss 1.60597403 - time (sec): 4.47 - samples/sec: 3436.06 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:51:42,474 epoch 1 - iter 90/152 - loss 1.42563076 - time (sec): 5.32 - samples/sec: 3437.03 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:51:43,351 epoch 1 - iter 105/152 - loss 1.27918897 - time (sec): 6.19 - samples/sec: 3430.91 - lr: 0.000034 - momentum: 0.000000
2023-10-13 09:51:44,229 epoch 1 - iter 120/152 - loss 1.14879180 - time (sec): 7.07 - samples/sec: 3460.88 - lr: 0.000039 - momentum: 0.000000
2023-10-13 09:51:45,124 epoch 1 - iter 135/152 - loss 1.05410170 - time (sec): 7.97 - samples/sec: 3447.54 - lr: 0.000044 - momentum: 0.000000
2023-10-13 09:51:46,085 epoch 1 - iter 150/152 - loss 0.96669425 - time (sec): 8.93 - samples/sec: 3436.17 - lr: 0.000049 - momentum: 0.000000
2023-10-13 09:51:46,193 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:46,193 EPOCH 1 done: loss 0.9589 - lr: 0.000049
2023-10-13 09:51:47,120 DEV : loss 0.2219340056180954 - f1-score (micro avg) 0.5476
2023-10-13 09:51:47,127 saving best model
2023-10-13 09:51:47,592 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:48,497 epoch 2 - iter 15/152 - loss 0.14177558 - time (sec): 0.90 - samples/sec: 3311.13 - lr: 0.000049 - momentum: 0.000000
2023-10-13 09:51:49,385 epoch 2 - iter 30/152 - loss 0.15297741 - time (sec): 1.79 - samples/sec: 3270.88 - lr: 0.000049 - momentum: 0.000000
2023-10-13 09:51:50,240 epoch 2 - iter 45/152 - loss 0.17527777 - time (sec): 2.65 - samples/sec: 3374.49 - lr: 0.000048 - momentum: 0.000000
2023-10-13 09:51:51,099 epoch 2 - iter 60/152 - loss 0.18184307 - time (sec): 3.50 - samples/sec: 3436.64 - lr: 0.000048 - momentum: 0.000000
2023-10-13 09:51:51,945 epoch 2 - iter 75/152 - loss 0.17714467 - time (sec): 4.35 - samples/sec: 3426.02 - lr: 0.000047 - momentum: 0.000000
2023-10-13 09:51:52,788 epoch 2 - iter 90/152 - loss 0.16127747 - time (sec): 5.19 - samples/sec: 3446.04 - lr: 0.000047 - momentum: 0.000000
2023-10-13 09:51:53,679 epoch 2 - iter 105/152 - loss 0.16075853 - time (sec): 6.09 - samples/sec: 3486.78 - lr: 0.000046 - momentum: 0.000000
2023-10-13 09:51:54,527 epoch 2 - iter 120/152 - loss 0.15950001 - time (sec): 6.93 - samples/sec: 3516.08 - lr: 0.000046 - momentum: 0.000000
2023-10-13 09:51:55,323 epoch 2 - iter 135/152 - loss 0.15740598 - time (sec): 7.73 - samples/sec: 3539.82 - lr: 0.000045 - momentum: 0.000000
2023-10-13 09:51:56,134 epoch 2 - iter 150/152 - loss 0.15156996 - time (sec): 8.54 - samples/sec: 3582.22 - lr: 0.000045 - momentum: 0.000000
2023-10-13 09:51:56,242 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:56,242 EPOCH 2 done: loss 0.1518 - lr: 0.000045
2023-10-13 09:51:57,149 DEV : loss 0.14548559486865997 - f1-score (micro avg) 0.7703
2023-10-13 09:51:57,156 saving best model
2023-10-13 09:51:57,715 ----------------------------------------------------------------------------------------------------
2023-10-13 09:51:58,608 epoch 3 - iter 15/152 - loss 0.10742813 - time (sec): 0.89 - samples/sec: 3607.54 - lr: 0.000044 - momentum: 0.000000
2023-10-13 09:51:59,447 epoch 3 - iter 30/152 - loss 0.08788928 - time (sec): 1.73 - samples/sec: 3534.60 - lr: 0.000043 - momentum: 0.000000
2023-10-13 09:52:00,335 epoch 3 - iter 45/152 - loss 0.09093327 - time (sec): 2.61 - samples/sec: 3615.93 - lr: 0.000043 - momentum: 0.000000
2023-10-13 09:52:01,106 epoch 3 - iter 60/152 - loss 0.09143828 - time (sec): 3.39 - samples/sec: 3609.32 - lr: 0.000042 - momentum: 0.000000
2023-10-13 09:52:01,935 epoch 3 - iter 75/152 - loss 0.09183066 - time (sec): 4.21 - samples/sec: 3651.18 - lr: 0.000042 - momentum: 0.000000
2023-10-13 09:52:02,782 epoch 3 - iter 90/152 - loss 0.08816801 - time (sec): 5.06 - samples/sec: 3609.40 - lr: 0.000041 - momentum: 0.000000
2023-10-13 09:52:03,652 epoch 3 - iter 105/152 - loss 0.09007988 - time (sec): 5.93 - samples/sec: 3627.95 - lr: 0.000041 - momentum: 0.000000
2023-10-13 09:52:04,503 epoch 3 - iter 120/152 - loss 0.08841149 - time (sec): 6.78 - samples/sec: 3598.64 - lr: 0.000040 - momentum: 0.000000
2023-10-13 09:52:05,340 epoch 3 - iter 135/152 - loss 0.08550505 - time (sec): 7.62 - samples/sec: 3608.51 - lr: 0.000040 - momentum: 0.000000
2023-10-13 09:52:06,176 epoch 3 - iter 150/152 - loss 0.08379199 - time (sec): 8.46 - samples/sec: 3626.08 - lr: 0.000039 - momentum: 0.000000
2023-10-13 09:52:06,293 ----------------------------------------------------------------------------------------------------
2023-10-13 09:52:06,293 EPOCH 3 done: loss 0.0831 - lr: 0.000039
2023-10-13 09:52:07,214 DEV : loss 0.1430310755968094 - f1-score (micro avg) 0.8042
2023-10-13 09:52:07,221 saving best model
2023-10-13 09:52:07,743 ----------------------------------------------------------------------------------------------------
2023-10-13 09:52:08,638 epoch 4 - iter 15/152 - loss 0.07624932 - time (sec): 0.89 - samples/sec: 3488.56 - lr: 0.000038 - momentum: 0.000000
2023-10-13 09:52:09,480 epoch 4 - iter 30/152 - loss 0.05187210 - time (sec): 1.73 - samples/sec: 3635.33 - lr: 0.000038 - momentum: 0.000000
2023-10-13 09:52:10,258 epoch 4 - iter 45/152 - loss 0.04200932 - time (sec): 2.51 - samples/sec: 3599.10 - lr: 0.000037 - momentum: 0.000000
2023-10-13 09:52:11,075 epoch 4 - iter 60/152 - loss 0.05000254 - time (sec): 3.33 - samples/sec: 3608.24 - lr: 0.000037 - momentum: 0.000000
2023-10-13 09:52:11,918 epoch 4 - iter 75/152 - loss 0.04852576 - time (sec): 4.17 - samples/sec: 3609.22 - lr: 0.000036 - momentum: 0.000000
2023-10-13 09:52:12,814 epoch 4 - iter 90/152 - loss 0.04968924 - time (sec): 5.07 - samples/sec: 3585.12 - lr: 0.000036 - momentum: 0.000000
2023-10-13 09:52:13,700 epoch 4 - iter 105/152 - loss 0.04832129 - time (sec): 5.95 - samples/sec: 3609.00 - lr: 0.000035 - momentum: 0.000000
2023-10-13 09:52:14,516 epoch 4 - iter 120/152 - loss 0.04894922 - time (sec): 6.77 - samples/sec: 3629.39 - lr: 0.000035 - momentum: 0.000000
2023-10-13 09:52:15,344 epoch 4 - iter 135/152 - loss 0.05437976 - time (sec): 7.60 - samples/sec: 3653.10 - lr: 0.000034 - momentum: 0.000000
2023-10-13 09:52:16,148 epoch 4 - iter 150/152 - loss 0.05287770 - time (sec): 8.40 - samples/sec: 3648.96 - lr: 0.000034 - momentum: 0.000000
2023-10-13 09:52:16,252 ----------------------------------------------------------------------------------------------------
2023-10-13 09:52:16,252 EPOCH 4 done: loss 0.0541 - lr: 0.000034
2023-10-13 09:52:17,225 DEV : loss 0.1504412442445755 - f1-score (micro avg) 0.8386
2023-10-13 09:52:17,235 saving best model
2023-10-13 09:52:17,726 ----------------------------------------------------------------------------------------------------
2023-10-13 09:52:18,637 epoch 5 - iter 15/152 - loss 0.04451402 - time (sec): 0.91 - samples/sec: 3849.73 - lr: 0.000033 - momentum: 0.000000
2023-10-13 09:52:19,495 epoch 5 - iter 30/152 - loss 0.04068398 - time (sec): 1.77 - samples/sec: 3620.67 - lr: 0.000032 - momentum: 0.000000
2023-10-13 09:52:20,282 epoch 5 - iter 45/152 - loss 0.04339379 - time (sec): 2.55 - samples/sec: 3598.62 - lr: 0.000032 - momentum: 0.000000
2023-10-13 09:52:21,123 epoch 5 - iter 60/152 - loss 0.04373249 - time (sec): 3.39 - samples/sec: 3633.81 - lr: 0.000031 - momentum: 0.000000
2023-10-13 09:52:21,956 epoch 5 - iter 75/152 - loss 0.04033663 - time (sec): 4.23 - samples/sec: 3636.66 - lr: 0.000031 - momentum: 0.000000
2023-10-13 09:52:22,784 epoch 5 - iter 90/152 - loss 0.04520264 - time (sec): 5.05 - samples/sec: 3624.34 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:52:23,639 epoch 5 - iter 105/152 - loss 0.04599801 - time (sec): 5.91 - samples/sec: 3602.77 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:52:24,525 epoch 5 - iter 120/152 - loss 0.04196418 - time (sec): 6.79 - samples/sec: 3620.51 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:52:25,356 epoch 5 - iter 135/152 - loss 0.03948772 - time (sec): 7.63 - samples/sec: 3629.96 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:52:26,190 epoch 5 - iter 150/152 - loss 0.04054485 - time (sec): 8.46 - samples/sec: 3631.46 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:52:26,283 ----------------------------------------------------------------------------------------------------
2023-10-13 09:52:26,284 EPOCH 5 done: loss 0.0402 - lr: 0.000028
2023-10-13 09:52:27,306 DEV : loss 0.18044513463974 - f1-score (micro avg) 0.8347
2023-10-13 09:52:27,317 ----------------------------------------------------------------------------------------------------
2023-10-13 09:52:28,268 epoch 6 - iter 15/152 - loss 0.01811564 - time (sec): 0.95 - samples/sec: 3206.66 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:52:29,140 epoch 6 - iter 30/152 - loss 0.02304878 - time (sec): 1.82 - samples/sec: 3372.41 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:52:30,009 epoch 6 - iter 45/152 - loss 0.03078048 - time (sec): 2.69 - samples/sec: 3470.63 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:52:30,873 epoch 6 - iter 60/152 - loss 0.02574138 - time (sec): 3.55 - samples/sec: 3523.03 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:52:31,677 epoch 6 - iter 75/152 - loss 0.02561661 - time (sec): 4.36 - samples/sec: 3558.59 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:52:32,533 epoch 6 - iter 90/152 - loss 0.02891677 - time (sec): 5.21 - samples/sec: 3563.07 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:52:33,390 epoch 6 - iter 105/152 - loss 0.02798043 - time (sec): 6.07 - samples/sec: 3588.35 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:52:34,222 epoch 6 - iter 120/152 - loss 0.02783302 - time (sec): 6.90 - samples/sec: 3584.98 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:52:35,059 epoch 6 - iter 135/152 - loss 0.02995414 - time (sec): 7.74 - samples/sec: 3580.84 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:52:35,876 epoch 6 - iter 150/152 - loss 0.03238096 - time (sec): 8.56 - samples/sec: 3585.13 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:52:35,975 ----------------------------------------------------------------------------------------------------
2023-10-13 09:52:35,975 EPOCH 6 done: loss 0.0323 - lr: 0.000022
2023-10-13 09:52:36,896 DEV : loss 0.2050701230764389 - f1-score (micro avg) 0.8163
2023-10-13 09:52:36,903 ----------------------------------------------------------------------------------------------------
2023-10-13 09:52:37,722 epoch 7 - iter 15/152 - loss 0.00899740 - time (sec): 0.82 - samples/sec: 3771.47 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:52:38,550 epoch 7 - iter 30/152 - loss 0.00623636 - time (sec): 1.65 - samples/sec: 3841.91 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:52:39,396 epoch 7 - iter 45/152 - loss 0.01073920 - time (sec): 2.49 - samples/sec: 3713.19 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:52:40,193 epoch 7 - iter 60/152 - loss 0.01099004 - time (sec): 3.29 - samples/sec: 3719.72 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:52:41,063 epoch 7 - iter 75/152 - loss 0.01664050 - time (sec): 4.16 - samples/sec: 3648.55 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:52:41,901 epoch 7 - iter 90/152 - loss 0.01949001 - time (sec): 5.00 - samples/sec: 3641.92 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:52:42,712 epoch 7 - iter 105/152 - loss 0.01944975 - time (sec): 5.81 - samples/sec: 3633.08 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:52:43,579 epoch 7 - iter 120/152 - loss 0.01731977 - time (sec): 6.68 - samples/sec: 3614.05 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:52:44,488 epoch 7 - iter 135/152 - loss 0.02132045 - time (sec): 7.58 - samples/sec: 3655.77 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:52:45,321 epoch 7 - iter 150/152 - loss 0.02427676 - time (sec): 8.42 - samples/sec: 3652.95 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:52:45,416 ----------------------------------------------------------------------------------------------------
2023-10-13 09:52:45,416 EPOCH 7 done: loss 0.0241 - lr: 0.000017
2023-10-13 09:52:46,374 DEV : loss 0.2159377783536911 - f1-score (micro avg) 0.821
2023-10-13 09:52:46,380 ----------------------------------------------------------------------------------------------------
2023-10-13 09:52:47,198 epoch 8 - iter 15/152 - loss 0.00590400 - time (sec): 0.82 - samples/sec: 4007.68 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:52:48,035 epoch 8 - iter 30/152 - loss 0.01512275 - time (sec): 1.65 - samples/sec: 3910.81 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:52:48,876 epoch 8 - iter 45/152 - loss 0.01893903 - time (sec): 2.50 - samples/sec: 3743.07 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:52:49,757 epoch 8 - iter 60/152 - loss 0.01568164 - time (sec): 3.38 - samples/sec: 3717.97 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:52:50,596 epoch 8 - iter 75/152 - loss 0.01832374 - time (sec): 4.21 - samples/sec: 3693.15 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:52:51,434 epoch 8 - iter 90/152 - loss 0.01630764 - time (sec): 5.05 - samples/sec: 3725.66 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:52:52,288 epoch 8 - iter 105/152 - loss 0.01965564 - time (sec): 5.91 - samples/sec: 3682.36 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:52:53,106 epoch 8 - iter 120/152 - loss 0.01960239 - time (sec): 6.72 - samples/sec: 3661.88 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:52:54,000 epoch 8 - iter 135/152 - loss 0.01745112 - time (sec): 7.62 - samples/sec: 3664.83 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:52:54,789 epoch 8 - iter 150/152 - loss 0.01819874 - time (sec): 8.41 - samples/sec: 3656.17 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:52:54,885 ----------------------------------------------------------------------------------------------------
2023-10-13 09:52:54,886 EPOCH 8 done: loss 0.0181 - lr: 0.000011
2023-10-13 09:52:55,874 DEV : loss 0.20518945157527924 - f1-score (micro avg) 0.8349
2023-10-13 09:52:55,884 ----------------------------------------------------------------------------------------------------
2023-10-13 09:52:56,748 epoch 9 - iter 15/152 - loss 0.00219304 - time (sec): 0.86 - samples/sec: 3241.13 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:52:57,678 epoch 9 - iter 30/152 - loss 0.00425266 - time (sec): 1.79 - samples/sec: 3277.14 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:52:58,529 epoch 9 - iter 45/152 - loss 0.00600527 - time (sec): 2.64 - samples/sec: 3410.54 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:52:59,417 epoch 9 - iter 60/152 - loss 0.00924101 - time (sec): 3.53 - samples/sec: 3379.27 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:53:00,328 epoch 9 - iter 75/152 - loss 0.01390503 - time (sec): 4.44 - samples/sec: 3345.76 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:53:01,281 epoch 9 - iter 90/152 - loss 0.01250961 - time (sec): 5.40 - samples/sec: 3372.87 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:53:02,194 epoch 9 - iter 105/152 - loss 0.01378079 - time (sec): 6.31 - samples/sec: 3366.93 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:53:03,105 epoch 9 - iter 120/152 - loss 0.01531733 - time (sec): 7.22 - samples/sec: 3409.39 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:53:04,009 epoch 9 - iter 135/152 - loss 0.01472643 - time (sec): 8.12 - samples/sec: 3403.47 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:53:04,892 epoch 9 - iter 150/152 - loss 0.01426090 - time (sec): 9.01 - samples/sec: 3400.24 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:53:04,997 ----------------------------------------------------------------------------------------------------
2023-10-13 09:53:04,997 EPOCH 9 done: loss 0.0141 - lr: 0.000006
2023-10-13 09:53:06,300 DEV : loss 0.20314878225326538 - f1-score (micro avg) 0.8481
2023-10-13 09:53:06,309 saving best model
2023-10-13 09:53:06,804 ----------------------------------------------------------------------------------------------------
2023-10-13 09:53:07,752 epoch 10 - iter 15/152 - loss 0.00094629 - time (sec): 0.94 - samples/sec: 3276.31 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:53:08,656 epoch 10 - iter 30/152 - loss 0.01162894 - time (sec): 1.85 - samples/sec: 3297.57 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:53:09,600 epoch 10 - iter 45/152 - loss 0.01058284 - time (sec): 2.79 - samples/sec: 3422.07 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:53:10,479 epoch 10 - iter 60/152 - loss 0.00954044 - time (sec): 3.67 - samples/sec: 3489.69 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:53:11,296 epoch 10 - iter 75/152 - loss 0.00948215 - time (sec): 4.49 - samples/sec: 3488.85 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:53:12,159 epoch 10 - iter 90/152 - loss 0.00926473 - time (sec): 5.35 - samples/sec: 3477.94 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:53:13,028 epoch 10 - iter 105/152 - loss 0.01082039 - time (sec): 6.22 - samples/sec: 3513.70 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:53:13,869 epoch 10 - iter 120/152 - loss 0.01051382 - time (sec): 7.06 - samples/sec: 3472.22 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:53:14,726 epoch 10 - iter 135/152 - loss 0.01043064 - time (sec): 7.92 - samples/sec: 3488.26 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:53:15,582 epoch 10 - iter 150/152 - loss 0.01221355 - time (sec): 8.78 - samples/sec: 3481.25 - lr: 0.000000 - momentum: 0.000000
2023-10-13 09:53:15,705 ----------------------------------------------------------------------------------------------------
2023-10-13 09:53:15,706 EPOCH 10 done: loss 0.0120 - lr: 0.000000
2023-10-13 09:53:16,619 DEV : loss 0.20412695407867432 - f1-score (micro avg) 0.8491
2023-10-13 09:53:16,625 saving best model
2023-10-13 09:53:17,508 ----------------------------------------------------------------------------------------------------
2023-10-13 09:53:17,509 Loading model from best epoch ...
2023-10-13 09:53:19,173 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-13 09:53:19,909
Results:
- F-score (micro) 0.7958
- F-score (macro) 0.6141
- Accuracy 0.6637

By class:
              precision    recall  f1-score   support

       scope     0.7688    0.8146    0.7910       151
        pers     0.7177    0.9271    0.8091        96
        work     0.7227    0.9053    0.8037        95
         loc     0.6667    0.6667    0.6667         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7389    0.8621    0.7958       348
   macro avg     0.5752    0.6627    0.6141       348
weighted avg     0.7346    0.8621    0.7916       348

2023-10-13 09:53:19,909 ----------------------------------------------------------------------------------------------------
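
Note on the summary rows of the final test report: the sketch below recomputes the three averaged F1 scores from the figures logged above. It is a sanity check under the usual definitions (F1 as the harmonic mean of precision and recall, macro as the unweighted mean of per-class F1, weighted as the support-weighted mean), not a reproduction of Flair's own evaluation code. The names `by_class` and `f1` are hypothetical helpers introduced here for illustration.

```python
# Per-class rows copied from the "By class" table in the log above:
# (precision, recall, f1-score, support). Names are illustrative only.
by_class = {
    "scope": (0.7688, 0.8146, 0.7910, 151),
    "pers": (0.7177, 0.9271, 0.8091, 96),
    "work": (0.7227, 0.9053, 0.8037, 95),
    "loc": (0.6667, 0.6667, 0.6667, 3),
    "date": (0.0000, 0.0000, 0.0000, 3),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1 from the logged micro-averaged precision and recall.
micro_f1 = f1(0.7389, 0.8621)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(row[3] for row in by_class.values())
weighted_f1 = sum(row[2] * row[3] for row in by_class.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

Because the inputs are already rounded to four decimals, the recomputed values can only be expected to agree with the logged ones up to rounding. The macro average is pulled down sharply by the `date` class, which has a support of only 3 and an F1 of 0.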