2023-10-17 19:55:17,598 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,599 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 19:55:17,599 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,599 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Train: 1085 sentences
2023-10-17 19:55:17,600 (train_with_dev=False, train_with_test=False)
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Training Params:
2023-10-17 19:55:17,600 - learning_rate: "5e-05"
2023-10-17 19:55:17,600 - mini_batch_size: "4"
2023-10-17 19:55:17,600 - max_epochs: "10"
2023-10-17 19:55:17,600 - shuffle: "True"
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Plugins:
2023-10-17 19:55:17,600 - TensorboardLogger
2023-10-17 19:55:17,600 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 19:55:17,600 - metric: "('micro avg', 'f1-score')"
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Computation:
2023-10-17 19:55:17,600 - compute on device: cuda:0
2023-10-17 19:55:17,600 - embedding storage: none
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Model training base path: "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 19:55:19,282 epoch 1 - iter 27/272 - loss 3.58728212 - time (sec): 1.68 - samples/sec: 2875.12 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:55:20,859 epoch 1 - iter 54/272 - loss 2.90987069 - time (sec): 3.26 - samples/sec: 2798.59 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:55:22,545 epoch 1 - iter 81/272 - loss 2.01439903 - time (sec): 4.94 - samples/sec: 3024.91 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:55:24,177 epoch 1 - iter 108/272 - loss 1.60141995 - time (sec): 6.58 - samples/sec: 3053.99 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:55:25,873 epoch 1 - iter 135/272 - loss 1.37197230 - time (sec): 8.27 - samples/sec: 2982.26 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:55:27,586 epoch 1 - iter 162/272 - loss 1.17795991 - time (sec): 9.98 - samples/sec: 3036.39 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:55:29,228 epoch 1 - iter 189/272 - loss 1.04761031 - time (sec): 11.63 - samples/sec: 3047.62 - lr: 0.000035 - momentum: 0.000000
2023-10-17 19:55:31,116 epoch 1 - iter 216/272 - loss 0.91564911 - time (sec): 13.52 - samples/sec: 3097.94 - lr: 0.000040 - momentum: 0.000000
2023-10-17 19:55:32,744 epoch 1 - iter 243/272 - loss 0.84955575 - time (sec): 15.14 - samples/sec: 3086.63 - lr: 0.000044 - momentum: 0.000000
2023-10-17 19:55:34,439 epoch 1 - iter 270/272 - loss 0.78512022 - time (sec): 16.84 - samples/sec: 3075.68 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:55:34,560 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:34,561 EPOCH 1 done: loss 0.7830 - lr: 0.000049
2023-10-17 19:55:35,793 DEV : loss 0.1515471190214157 - f1-score (micro avg) 0.6667
2023-10-17 19:55:35,798 saving best model
2023-10-17 19:55:36,227 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:37,920 epoch 2 - iter 27/272 - loss 0.12924177 - time (sec): 1.69 - samples/sec: 3053.54 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:55:39,505 epoch 2 - iter 54/272 - loss 0.14622270 - time (sec): 3.28 - samples/sec: 3212.70 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:55:41,201 epoch 2 - iter 81/272 - loss 0.15508578 - time (sec): 4.97 - samples/sec: 3232.69 - lr: 0.000048 - momentum: 0.000000
2023-10-17 19:55:42,823 epoch 2 - iter 108/272 - loss 0.15526673 - time (sec): 6.59 - samples/sec: 3233.37 - lr: 0.000048 - momentum: 0.000000
2023-10-17 19:55:44,296 epoch 2 - iter 135/272 - loss 0.14945571 - time (sec): 8.07 - samples/sec: 3198.57 - lr: 0.000047 - momentum: 0.000000
2023-10-17 19:55:45,967 epoch 2 - iter 162/272 - loss 0.15751951 - time (sec): 9.74 - samples/sec: 3202.41 - lr: 0.000047 - momentum: 0.000000
2023-10-17 19:55:47,469 epoch 2 - iter 189/272 - loss 0.15270059 - time (sec): 11.24 - samples/sec: 3171.02 - lr: 0.000046 - momentum: 0.000000
2023-10-17 19:55:49,098 epoch 2 - iter 216/272 - loss 0.14321390 - time (sec): 12.87 - samples/sec: 3212.17 - lr: 0.000046 - momentum: 0.000000
2023-10-17 19:55:50,812 epoch 2 - iter 243/272 - loss 0.13815810 - time (sec): 14.58 - samples/sec: 3193.14 - lr: 0.000045 - momentum: 0.000000
2023-10-17 19:55:52,345 epoch 2 - iter 270/272 - loss 0.13549913 - time (sec): 16.12 - samples/sec: 3208.84 - lr: 0.000045 - momentum: 0.000000
2023-10-17 19:55:52,441 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:52,441 EPOCH 2 done: loss 0.1351 - lr: 0.000045
2023-10-17 19:55:53,883 DEV : loss 0.10580824315547943 - f1-score (micro avg) 0.7726
2023-10-17 19:55:53,888 saving best model
2023-10-17 19:55:54,391 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:56,140 epoch 3 - iter 27/272 - loss 0.07036557 - time (sec): 1.74 - samples/sec: 3130.71 - lr: 0.000044 - momentum: 0.000000
2023-10-17 19:55:57,793 epoch 3 - iter 54/272 - loss 0.07600464 - time (sec): 3.40 - samples/sec: 3299.42 - lr: 0.000043 - momentum: 0.000000
2023-10-17 19:55:59,344 epoch 3 - iter 81/272 - loss 0.07559162 - time (sec): 4.95 - samples/sec: 3335.05 - lr: 0.000043 - momentum: 0.000000
2023-10-17 19:56:00,940 epoch 3 - iter 108/272 - loss 0.08826217 - time (sec): 6.55 - samples/sec: 3351.87 - lr: 0.000042 - momentum: 0.000000
2023-10-17 19:56:02,554 epoch 3 - iter 135/272 - loss 0.08009279 - time (sec): 8.16 - samples/sec: 3341.18 - lr: 0.000042 - momentum: 0.000000
2023-10-17 19:56:04,082 epoch 3 - iter 162/272 - loss 0.08296104 - time (sec): 9.69 - samples/sec: 3313.53 - lr: 0.000041 - momentum: 0.000000
2023-10-17 19:56:05,656 epoch 3 - iter 189/272 - loss 0.08693437 - time (sec): 11.26 - samples/sec: 3302.14 - lr: 0.000041 - momentum: 0.000000
2023-10-17 19:56:07,084 epoch 3 - iter 216/272 - loss 0.08846207 - time (sec): 12.69 - samples/sec: 3275.61 - lr: 0.000040 - momentum: 0.000000
2023-10-17 19:56:08,705 epoch 3 - iter 243/272 - loss 0.08394891 - time (sec): 14.31 - samples/sec: 3290.12 - lr: 0.000040 - momentum: 0.000000
2023-10-17 19:56:10,183 epoch 3 - iter 270/272 - loss 0.08317210 - time (sec): 15.79 - samples/sec: 3282.51 - lr: 0.000039 - momentum: 0.000000
2023-10-17 19:56:10,276 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:10,277 EPOCH 3 done: loss 0.0829 - lr: 0.000039
2023-10-17 19:56:11,707 DEV : loss 0.10402542352676392 - f1-score (micro avg) 0.8043
2023-10-17 19:56:11,712 saving best model
2023-10-17 19:56:12,213 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:13,938 epoch 4 - iter 27/272 - loss 0.03477530 - time (sec): 1.72 - samples/sec: 2832.62 - lr: 0.000038 - momentum: 0.000000
2023-10-17 19:56:15,671 epoch 4 - iter 54/272 - loss 0.03856732 - time (sec): 3.45 - samples/sec: 2877.04 - lr: 0.000038 - momentum: 0.000000
2023-10-17 19:56:17,462 epoch 4 - iter 81/272 - loss 0.04308053 - time (sec): 5.25 - samples/sec: 3020.32 - lr: 0.000037 - momentum: 0.000000
2023-10-17 19:56:18,885 epoch 4 - iter 108/272 - loss 0.04002880 - time (sec): 6.67 - samples/sec: 3025.83 - lr: 0.000037 - momentum: 0.000000
2023-10-17 19:56:20,446 epoch 4 - iter 135/272 - loss 0.04038678 - time (sec): 8.23 - samples/sec: 3063.84 - lr: 0.000036 - momentum: 0.000000
2023-10-17 19:56:22,343 epoch 4 - iter 162/272 - loss 0.04511288 - time (sec): 10.13 - samples/sec: 3062.76 - lr: 0.000036 - momentum: 0.000000
2023-10-17 19:56:23,947 epoch 4 - iter 189/272 - loss 0.04671298 - time (sec): 11.73 - samples/sec: 3072.60 - lr: 0.000035 - momentum: 0.000000
2023-10-17 19:56:25,610 epoch 4 - iter 216/272 - loss 0.04939325 - time (sec): 13.39 - samples/sec: 3080.46 - lr: 0.000034 - momentum: 0.000000
2023-10-17 19:56:27,178 epoch 4 - iter 243/272 - loss 0.05098514 - time (sec): 14.96 - samples/sec: 3100.74 - lr: 0.000034 - momentum: 0.000000
2023-10-17 19:56:28,751 epoch 4 - iter 270/272 - loss 0.05096145 - time (sec): 16.53 - samples/sec: 3129.77 - lr: 0.000033 - momentum: 0.000000
2023-10-17 19:56:28,851 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:28,852 EPOCH 4 done: loss 0.0511 - lr: 0.000033
2023-10-17 19:56:30,341 DEV : loss 0.12085414677858353 - f1-score (micro avg) 0.8052
2023-10-17 19:56:30,347 saving best model
2023-10-17 19:56:30,865 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:32,517 epoch 5 - iter 27/272 - loss 0.02111467 - time (sec): 1.65 - samples/sec: 3295.06 - lr: 0.000033 - momentum: 0.000000
2023-10-17 19:56:34,213 epoch 5 - iter 54/272 - loss 0.02884558 - time (sec): 3.34 - samples/sec: 3241.32 - lr: 0.000032 - momentum: 0.000000
2023-10-17 19:56:35,887 epoch 5 - iter 81/272 - loss 0.03350373 - time (sec): 5.02 - samples/sec: 3199.38 - lr: 0.000032 - momentum: 0.000000
2023-10-17 19:56:37,661 epoch 5 - iter 108/272 - loss 0.03765684 - time (sec): 6.79 - samples/sec: 3099.04 - lr: 0.000031 - momentum: 0.000000
2023-10-17 19:56:39,470 epoch 5 - iter 135/272 - loss 0.03526537 - time (sec): 8.60 - samples/sec: 3039.27 - lr: 0.000031 - momentum: 0.000000
2023-10-17 19:56:41,155 epoch 5 - iter 162/272 - loss 0.03255842 - time (sec): 10.29 - samples/sec: 3040.16 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:56:42,833 epoch 5 - iter 189/272 - loss 0.03591230 - time (sec): 11.96 - samples/sec: 3028.25 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:56:44,469 epoch 5 - iter 216/272 - loss 0.03429439 - time (sec): 13.60 - samples/sec: 3062.62 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:56:46,013 epoch 5 - iter 243/272 - loss 0.03432142 - time (sec): 15.15 - samples/sec: 3056.50 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:56:47,629 epoch 5 - iter 270/272 - loss 0.03306242 - time (sec): 16.76 - samples/sec: 3093.88 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:56:47,714 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:47,714 EPOCH 5 done: loss 0.0331 - lr: 0.000028
2023-10-17 19:56:49,198 DEV : loss 0.16374681890010834 - f1-score (micro avg) 0.7731
2023-10-17 19:56:49,205 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:50,796 epoch 6 - iter 27/272 - loss 0.01795708 - time (sec): 1.59 - samples/sec: 2942.56 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:56:52,589 epoch 6 - iter 54/272 - loss 0.02307386 - time (sec): 3.38 - samples/sec: 2914.30 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:56:54,348 epoch 6 - iter 81/272 - loss 0.02335478 - time (sec): 5.14 - samples/sec: 2954.02 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:56:56,062 epoch 6 - iter 108/272 - loss 0.02168575 - time (sec): 6.85 - samples/sec: 2958.19 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:56:57,770 epoch 6 - iter 135/272 - loss 0.02071478 - time (sec): 8.56 - samples/sec: 3025.61 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:56:59,511 epoch 6 - iter 162/272 - loss 0.02139419 - time (sec): 10.30 - samples/sec: 3098.72 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:57:01,179 epoch 6 - iter 189/272 - loss 0.02154034 - time (sec): 11.97 - samples/sec: 3090.78 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:57:02,732 epoch 6 - iter 216/272 - loss 0.02158152 - time (sec): 13.52 - samples/sec: 3084.52 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:57:04,267 epoch 6 - iter 243/272 - loss 0.02043382 - time (sec): 15.06 - samples/sec: 3093.99 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:57:05,846 epoch 6 - iter 270/272 - loss 0.02340166 - time (sec): 16.64 - samples/sec: 3114.02 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:57:05,945 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:05,946 EPOCH 6 done: loss 0.0233 - lr: 0.000022
2023-10-17 19:57:07,395 DEV : loss 0.18149712681770325 - f1-score (micro avg) 0.7712
2023-10-17 19:57:07,400 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:09,258 epoch 7 - iter 27/272 - loss 0.01987005 - time (sec): 1.86 - samples/sec: 3361.98 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:57:10,786 epoch 7 - iter 54/272 - loss 0.02048753 - time (sec): 3.38 - samples/sec: 3357.29 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:57:12,194 epoch 7 - iter 81/272 - loss 0.01675398 - time (sec): 4.79 - samples/sec: 3296.25 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:57:13,715 epoch 7 - iter 108/272 - loss 0.01410873 - time (sec): 6.31 - samples/sec: 3195.56 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:57:15,339 epoch 7 - iter 135/272 - loss 0.01440026 - time (sec): 7.94 - samples/sec: 3167.80 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:57:16,937 epoch 7 - iter 162/272 - loss 0.01524027 - time (sec): 9.54 - samples/sec: 3181.29 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:57:18,475 epoch 7 - iter 189/272 - loss 0.01363766 - time (sec): 11.07 - samples/sec: 3223.90 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:57:20,051 epoch 7 - iter 216/272 - loss 0.01279916 - time (sec): 12.65 - samples/sec: 3247.31 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:57:21,757 epoch 7 - iter 243/272 - loss 0.01479402 - time (sec): 14.36 - samples/sec: 3231.58 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:57:23,391 epoch 7 - iter 270/272 - loss 0.01589460 - time (sec): 15.99 - samples/sec: 3231.65 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:57:23,498 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:23,498 EPOCH 7 done: loss 0.0158 - lr: 0.000017
2023-10-17 19:57:24,944 DEV : loss 0.16515286266803741 - f1-score (micro avg) 0.814
2023-10-17 19:57:24,951 saving best model
2023-10-17 19:57:25,573 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:27,265 epoch 8 - iter 27/272 - loss 0.00723031 - time (sec): 1.69 - samples/sec: 2942.52 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:57:29,114 epoch 8 - iter 54/272 - loss 0.01039951 - time (sec): 3.54 - samples/sec: 3105.30 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:57:30,841 epoch 8 - iter 81/272 - loss 0.01020122 - time (sec): 5.26 - samples/sec: 3104.25 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:57:32,550 epoch 8 - iter 108/272 - loss 0.01100986 - time (sec): 6.97 - samples/sec: 2990.26 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:57:34,626 epoch 8 - iter 135/272 - loss 0.01325530 - time (sec): 9.05 - samples/sec: 2913.62 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:57:36,295 epoch 8 - iter 162/272 - loss 0.01235273 - time (sec): 10.72 - samples/sec: 2947.60 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:57:38,102 epoch 8 - iter 189/272 - loss 0.01435970 - time (sec): 12.53 - samples/sec: 3020.99 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:57:39,541 epoch 8 - iter 216/272 - loss 0.01417857 - time (sec): 13.96 - samples/sec: 3006.09 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:57:41,022 epoch 8 - iter 243/272 - loss 0.01414234 - time (sec): 15.45 - samples/sec: 2994.35 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:57:42,698 epoch 8 - iter 270/272 - loss 0.01356212 - time (sec): 17.12 - samples/sec: 3023.14 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:57:42,788 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:42,789 EPOCH 8 done: loss 0.0136 - lr: 0.000011
2023-10-17 19:57:44,251 DEV : loss 0.17592966556549072 - f1-score (micro avg) 0.8222
2023-10-17 19:57:44,256 saving best model
2023-10-17 19:57:44,791 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:46,426 epoch 9 - iter 27/272 - loss 0.00538360 - time (sec): 1.63 - samples/sec: 2942.35 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:57:48,183 epoch 9 - iter 54/272 - loss 0.00340815 - time (sec): 3.39 - samples/sec: 2946.64 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:57:49,700 epoch 9 - iter 81/272 - loss 0.00343662 - time (sec): 4.91 - samples/sec: 2873.13 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:57:51,553 epoch 9 - iter 108/272 - loss 0.00716874 - time (sec): 6.76 - samples/sec: 2978.89 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:57:53,125 epoch 9 - iter 135/272 - loss 0.00825452 - time (sec): 8.33 - samples/sec: 2975.95 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:57:54,779 epoch 9 - iter 162/272 - loss 0.00786866 - time (sec): 9.98 - samples/sec: 2975.76 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:57:56,585 epoch 9 - iter 189/272 - loss 0.00753915 - time (sec): 11.79 - samples/sec: 3091.69 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:57:58,248 epoch 9 - iter 216/272 - loss 0.00811762 - time (sec): 13.45 - samples/sec: 3103.59 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:57:59,862 epoch 9 - iter 243/272 - loss 0.00912190 - time (sec): 15.07 - samples/sec: 3068.80 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:58:01,590 epoch 9 - iter 270/272 - loss 0.00837205 - time (sec): 16.79 - samples/sec: 3084.14 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:58:01,677 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:01,678 EPOCH 9 done: loss 0.0083 - lr: 0.000006
2023-10-17 19:58:03,185 DEV : loss 0.18768392503261566 - f1-score (micro avg) 0.7963
2023-10-17 19:58:03,191 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:04,833 epoch 10 - iter 27/272 - loss 0.00264070 - time (sec): 1.64 - samples/sec: 3087.66 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:58:06,348 epoch 10 - iter 54/272 - loss 0.00313848 - time (sec): 3.16 - samples/sec: 3101.26 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:58:07,852 epoch 10 - iter 81/272 - loss 0.00255139 - time (sec): 4.66 - samples/sec: 3117.78 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:58:09,420 epoch 10 - iter 108/272 - loss 0.00238475 - time (sec): 6.23 - samples/sec: 3211.81 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:58:11,094 epoch 10 - iter 135/272 - loss 0.00263451 - time (sec): 7.90 - samples/sec: 3256.43 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:58:12,882 epoch 10 - iter 162/272 - loss 0.00305484 - time (sec): 9.69 - samples/sec: 3260.03 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:58:14,418 epoch 10 - iter 189/272 - loss 0.00319056 - time (sec): 11.23 - samples/sec: 3223.06 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:58:16,162 epoch 10 - iter 216/272 - loss 0.00368433 - time (sec): 12.97 - samples/sec: 3182.47 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:58:18,016 epoch 10 - iter 243/272 - loss 0.00572804 - time (sec): 14.82 - samples/sec: 3165.87 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:58:19,571 epoch 10 - iter 270/272 - loss 0.00528442 - time (sec): 16.38 - samples/sec: 3162.27 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:58:19,667 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:19,667 EPOCH 10 done: loss 0.0054 - lr: 0.000000
2023-10-17 19:58:21,160 DEV : loss 0.18381302058696747 - f1-score (micro avg) 0.8037
2023-10-17 19:58:21,585 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:21,586 Loading model from best epoch ...
2023-10-17 19:58:23,333 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 19:58:25,658
Results:
- F-score (micro) 0.8069
- F-score (macro) 0.7678
- Accuracy 0.6937

By class:
              precision    recall  f1-score   support

         LOC     0.8140    0.8974    0.8537       312
         PER     0.7258    0.8654    0.7895       208
         ORG     0.5818    0.5818    0.5818        55
   HumanProd     0.7333    1.0000    0.8462        22

   micro avg     0.7592    0.8610    0.8069       597
   macro avg     0.7137    0.8362    0.7678       597
weighted avg     0.7589    0.8610    0.8060       597

2023-10-17 19:58:25,658 ----------------------------------------------------------------------------------------------------