2023-10-17 19:55:17,598 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,599 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 19:55:17,599 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,599 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Train: 1085 sentences
2023-10-17 19:55:17,600 (train_with_dev=False, train_with_test=False)
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Training Params:
2023-10-17 19:55:17,600 - learning_rate: "5e-05"
2023-10-17 19:55:17,600 - mini_batch_size: "4"
2023-10-17 19:55:17,600 - max_epochs: "10"
2023-10-17 19:55:17,600 - shuffle: "True"
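A note on the iteration counters in the per-epoch lines below: with 1085 training sentences and a mini-batch size of 4, each epoch consists of ceil(1085 / 4) = 272 mini-batches, which is exactly the `iter x/272` counter the trainer logs. A minimal sketch of that arithmetic (plain Python, no Flair dependency):

```python
import math

train_sentences = 1085  # from the MultiCorpus summary above
mini_batch_size = 4     # from the Training Params section

# Mini-batches per epoch; the last, partially filled batch still counts.
iters_per_epoch = math.ceil(train_sentences / mini_batch_size)
print(iters_per_epoch)  # → 272
```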
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Plugins:
2023-10-17 19:55:17,600 - TensorboardLogger
2023-10-17 19:55:17,600 - LinearScheduler | warmup_fraction: '0.1'
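The `lr` column in the iteration lines below follows the LinearScheduler with `warmup_fraction: 0.1`: the learning rate ramps linearly from 0 to the peak of 5e-05 over the first 10% of all steps (272 of the 2720 total), then decays linearly back to 0 by the final step. A hedged sketch of that schedule (plain Python; Flair's plugin may differ from this by an off-by-one in its step accounting, and the function name here is illustrative):

```python
def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# 272 mini-batches/epoch x 10 epochs = 2720 steps; warmup covers the first 272.
# Early in epoch 1 the lr is still ~5e-06, it peaks at 5e-05 at the end of
# warmup, and reaches 0 at the last step, matching the logged values.
```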
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 19:55:17,600 - metric: "('micro avg', 'f1-score')"
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Computation:
2023-10-17 19:55:17,600 - compute on device: cuda:0
2023-10-17 19:55:17,600 - embedding storage: none
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Model training base path: "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:17,600 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 19:55:19,282 epoch 1 - iter 27/272 - loss 3.58728212 - time (sec): 1.68 - samples/sec: 2875.12 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:55:20,859 epoch 1 - iter 54/272 - loss 2.90987069 - time (sec): 3.26 - samples/sec: 2798.59 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:55:22,545 epoch 1 - iter 81/272 - loss 2.01439903 - time (sec): 4.94 - samples/sec: 3024.91 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:55:24,177 epoch 1 - iter 108/272 - loss 1.60141995 - time (sec): 6.58 - samples/sec: 3053.99 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:55:25,873 epoch 1 - iter 135/272 - loss 1.37197230 - time (sec): 8.27 - samples/sec: 2982.26 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:55:27,586 epoch 1 - iter 162/272 - loss 1.17795991 - time (sec): 9.98 - samples/sec: 3036.39 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:55:29,228 epoch 1 - iter 189/272 - loss 1.04761031 - time (sec): 11.63 - samples/sec: 3047.62 - lr: 0.000035 - momentum: 0.000000
2023-10-17 19:55:31,116 epoch 1 - iter 216/272 - loss 0.91564911 - time (sec): 13.52 - samples/sec: 3097.94 - lr: 0.000040 - momentum: 0.000000
2023-10-17 19:55:32,744 epoch 1 - iter 243/272 - loss 0.84955575 - time (sec): 15.14 - samples/sec: 3086.63 - lr: 0.000044 - momentum: 0.000000
2023-10-17 19:55:34,439 epoch 1 - iter 270/272 - loss 0.78512022 - time (sec): 16.84 - samples/sec: 3075.68 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:55:34,560 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:34,561 EPOCH 1 done: loss 0.7830 - lr: 0.000049
2023-10-17 19:55:35,793 DEV : loss 0.1515471190214157 - f1-score (micro avg) 0.6667
2023-10-17 19:55:35,798 saving best model
2023-10-17 19:55:36,227 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:37,920 epoch 2 - iter 27/272 - loss 0.12924177 - time (sec): 1.69 - samples/sec: 3053.54 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:55:39,505 epoch 2 - iter 54/272 - loss 0.14622270 - time (sec): 3.28 - samples/sec: 3212.70 - lr: 0.000049 - momentum: 0.000000
2023-10-17 19:55:41,201 epoch 2 - iter 81/272 - loss 0.15508578 - time (sec): 4.97 - samples/sec: 3232.69 - lr: 0.000048 - momentum: 0.000000
2023-10-17 19:55:42,823 epoch 2 - iter 108/272 - loss 0.15526673 - time (sec): 6.59 - samples/sec: 3233.37 - lr: 0.000048 - momentum: 0.000000
2023-10-17 19:55:44,296 epoch 2 - iter 135/272 - loss 0.14945571 - time (sec): 8.07 - samples/sec: 3198.57 - lr: 0.000047 - momentum: 0.000000
2023-10-17 19:55:45,967 epoch 2 - iter 162/272 - loss 0.15751951 - time (sec): 9.74 - samples/sec: 3202.41 - lr: 0.000047 - momentum: 0.000000
2023-10-17 19:55:47,469 epoch 2 - iter 189/272 - loss 0.15270059 - time (sec): 11.24 - samples/sec: 3171.02 - lr: 0.000046 - momentum: 0.000000
2023-10-17 19:55:49,098 epoch 2 - iter 216/272 - loss 0.14321390 - time (sec): 12.87 - samples/sec: 3212.17 - lr: 0.000046 - momentum: 0.000000
2023-10-17 19:55:50,812 epoch 2 - iter 243/272 - loss 0.13815810 - time (sec): 14.58 - samples/sec: 3193.14 - lr: 0.000045 - momentum: 0.000000
2023-10-17 19:55:52,345 epoch 2 - iter 270/272 - loss 0.13549913 - time (sec): 16.12 - samples/sec: 3208.84 - lr: 0.000045 - momentum: 0.000000
2023-10-17 19:55:52,441 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:52,441 EPOCH 2 done: loss 0.1351 - lr: 0.000045
2023-10-17 19:55:53,883 DEV : loss 0.10580824315547943 - f1-score (micro avg) 0.7726
2023-10-17 19:55:53,888 saving best model
2023-10-17 19:55:54,391 ----------------------------------------------------------------------------------------------------
2023-10-17 19:55:56,140 epoch 3 - iter 27/272 - loss 0.07036557 - time (sec): 1.74 - samples/sec: 3130.71 - lr: 0.000044 - momentum: 0.000000
2023-10-17 19:55:57,793 epoch 3 - iter 54/272 - loss 0.07600464 - time (sec): 3.40 - samples/sec: 3299.42 - lr: 0.000043 - momentum: 0.000000
2023-10-17 19:55:59,344 epoch 3 - iter 81/272 - loss 0.07559162 - time (sec): 4.95 - samples/sec: 3335.05 - lr: 0.000043 - momentum: 0.000000
2023-10-17 19:56:00,940 epoch 3 - iter 108/272 - loss 0.08826217 - time (sec): 6.55 - samples/sec: 3351.87 - lr: 0.000042 - momentum: 0.000000
2023-10-17 19:56:02,554 epoch 3 - iter 135/272 - loss 0.08009279 - time (sec): 8.16 - samples/sec: 3341.18 - lr: 0.000042 - momentum: 0.000000
2023-10-17 19:56:04,082 epoch 3 - iter 162/272 - loss 0.08296104 - time (sec): 9.69 - samples/sec: 3313.53 - lr: 0.000041 - momentum: 0.000000
2023-10-17 19:56:05,656 epoch 3 - iter 189/272 - loss 0.08693437 - time (sec): 11.26 - samples/sec: 3302.14 - lr: 0.000041 - momentum: 0.000000
2023-10-17 19:56:07,084 epoch 3 - iter 216/272 - loss 0.08846207 - time (sec): 12.69 - samples/sec: 3275.61 - lr: 0.000040 - momentum: 0.000000
2023-10-17 19:56:08,705 epoch 3 - iter 243/272 - loss 0.08394891 - time (sec): 14.31 - samples/sec: 3290.12 - lr: 0.000040 - momentum: 0.000000
2023-10-17 19:56:10,183 epoch 3 - iter 270/272 - loss 0.08317210 - time (sec): 15.79 - samples/sec: 3282.51 - lr: 0.000039 - momentum: 0.000000
2023-10-17 19:56:10,276 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:10,277 EPOCH 3 done: loss 0.0829 - lr: 0.000039
2023-10-17 19:56:11,707 DEV : loss 0.10402542352676392 - f1-score (micro avg) 0.8043
2023-10-17 19:56:11,712 saving best model
2023-10-17 19:56:12,213 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:13,938 epoch 4 - iter 27/272 - loss 0.03477530 - time (sec): 1.72 - samples/sec: 2832.62 - lr: 0.000038 - momentum: 0.000000
2023-10-17 19:56:15,671 epoch 4 - iter 54/272 - loss 0.03856732 - time (sec): 3.45 - samples/sec: 2877.04 - lr: 0.000038 - momentum: 0.000000
2023-10-17 19:56:17,462 epoch 4 - iter 81/272 - loss 0.04308053 - time (sec): 5.25 - samples/sec: 3020.32 - lr: 0.000037 - momentum: 0.000000
2023-10-17 19:56:18,885 epoch 4 - iter 108/272 - loss 0.04002880 - time (sec): 6.67 - samples/sec: 3025.83 - lr: 0.000037 - momentum: 0.000000
2023-10-17 19:56:20,446 epoch 4 - iter 135/272 - loss 0.04038678 - time (sec): 8.23 - samples/sec: 3063.84 - lr: 0.000036 - momentum: 0.000000
2023-10-17 19:56:22,343 epoch 4 - iter 162/272 - loss 0.04511288 - time (sec): 10.13 - samples/sec: 3062.76 - lr: 0.000036 - momentum: 0.000000
2023-10-17 19:56:23,947 epoch 4 - iter 189/272 - loss 0.04671298 - time (sec): 11.73 - samples/sec: 3072.60 - lr: 0.000035 - momentum: 0.000000
2023-10-17 19:56:25,610 epoch 4 - iter 216/272 - loss 0.04939325 - time (sec): 13.39 - samples/sec: 3080.46 - lr: 0.000034 - momentum: 0.000000
2023-10-17 19:56:27,178 epoch 4 - iter 243/272 - loss 0.05098514 - time (sec): 14.96 - samples/sec: 3100.74 - lr: 0.000034 - momentum: 0.000000
2023-10-17 19:56:28,751 epoch 4 - iter 270/272 - loss 0.05096145 - time (sec): 16.53 - samples/sec: 3129.77 - lr: 0.000033 - momentum: 0.000000
2023-10-17 19:56:28,851 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:28,852 EPOCH 4 done: loss 0.0511 - lr: 0.000033
2023-10-17 19:56:30,341 DEV : loss 0.12085414677858353 - f1-score (micro avg) 0.8052
2023-10-17 19:56:30,347 saving best model
2023-10-17 19:56:30,865 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:32,517 epoch 5 - iter 27/272 - loss 0.02111467 - time (sec): 1.65 - samples/sec: 3295.06 - lr: 0.000033 - momentum: 0.000000
2023-10-17 19:56:34,213 epoch 5 - iter 54/272 - loss 0.02884558 - time (sec): 3.34 - samples/sec: 3241.32 - lr: 0.000032 - momentum: 0.000000
2023-10-17 19:56:35,887 epoch 5 - iter 81/272 - loss 0.03350373 - time (sec): 5.02 - samples/sec: 3199.38 - lr: 0.000032 - momentum: 0.000000
2023-10-17 19:56:37,661 epoch 5 - iter 108/272 - loss 0.03765684 - time (sec): 6.79 - samples/sec: 3099.04 - lr: 0.000031 - momentum: 0.000000
2023-10-17 19:56:39,470 epoch 5 - iter 135/272 - loss 0.03526537 - time (sec): 8.60 - samples/sec: 3039.27 - lr: 0.000031 - momentum: 0.000000
2023-10-17 19:56:41,155 epoch 5 - iter 162/272 - loss 0.03255842 - time (sec): 10.29 - samples/sec: 3040.16 - lr: 0.000030 - momentum: 0.000000
2023-10-17 19:56:42,833 epoch 5 - iter 189/272 - loss 0.03591230 - time (sec): 11.96 - samples/sec: 3028.25 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:56:44,469 epoch 5 - iter 216/272 - loss 0.03429439 - time (sec): 13.60 - samples/sec: 3062.62 - lr: 0.000029 - momentum: 0.000000
2023-10-17 19:56:46,013 epoch 5 - iter 243/272 - loss 0.03432142 - time (sec): 15.15 - samples/sec: 3056.50 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:56:47,629 epoch 5 - iter 270/272 - loss 0.03306242 - time (sec): 16.76 - samples/sec: 3093.88 - lr: 0.000028 - momentum: 0.000000
2023-10-17 19:56:47,714 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:47,714 EPOCH 5 done: loss 0.0331 - lr: 0.000028
2023-10-17 19:56:49,198 DEV : loss 0.16374681890010834 - f1-score (micro avg) 0.7731
2023-10-17 19:56:49,205 ----------------------------------------------------------------------------------------------------
2023-10-17 19:56:50,796 epoch 6 - iter 27/272 - loss 0.01795708 - time (sec): 1.59 - samples/sec: 2942.56 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:56:52,589 epoch 6 - iter 54/272 - loss 0.02307386 - time (sec): 3.38 - samples/sec: 2914.30 - lr: 0.000027 - momentum: 0.000000
2023-10-17 19:56:54,348 epoch 6 - iter 81/272 - loss 0.02335478 - time (sec): 5.14 - samples/sec: 2954.02 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:56:56,062 epoch 6 - iter 108/272 - loss 0.02168575 - time (sec): 6.85 - samples/sec: 2958.19 - lr: 0.000026 - momentum: 0.000000
2023-10-17 19:56:57,770 epoch 6 - iter 135/272 - loss 0.02071478 - time (sec): 8.56 - samples/sec: 3025.61 - lr: 0.000025 - momentum: 0.000000
2023-10-17 19:56:59,511 epoch 6 - iter 162/272 - loss 0.02139419 - time (sec): 10.30 - samples/sec: 3098.72 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:57:01,179 epoch 6 - iter 189/272 - loss 0.02154034 - time (sec): 11.97 - samples/sec: 3090.78 - lr: 0.000024 - momentum: 0.000000
2023-10-17 19:57:02,732 epoch 6 - iter 216/272 - loss 0.02158152 - time (sec): 13.52 - samples/sec: 3084.52 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:57:04,267 epoch 6 - iter 243/272 - loss 0.02043382 - time (sec): 15.06 - samples/sec: 3093.99 - lr: 0.000023 - momentum: 0.000000
2023-10-17 19:57:05,846 epoch 6 - iter 270/272 - loss 0.02340166 - time (sec): 16.64 - samples/sec: 3114.02 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:57:05,945 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:05,946 EPOCH 6 done: loss 0.0233 - lr: 0.000022
2023-10-17 19:57:07,395 DEV : loss 0.18149712681770325 - f1-score (micro avg) 0.7712
2023-10-17 19:57:07,400 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:09,258 epoch 7 - iter 27/272 - loss 0.01987005 - time (sec): 1.86 - samples/sec: 3361.98 - lr: 0.000022 - momentum: 0.000000
2023-10-17 19:57:10,786 epoch 7 - iter 54/272 - loss 0.02048753 - time (sec): 3.38 - samples/sec: 3357.29 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:57:12,194 epoch 7 - iter 81/272 - loss 0.01675398 - time (sec): 4.79 - samples/sec: 3296.25 - lr: 0.000021 - momentum: 0.000000
2023-10-17 19:57:13,715 epoch 7 - iter 108/272 - loss 0.01410873 - time (sec): 6.31 - samples/sec: 3195.56 - lr: 0.000020 - momentum: 0.000000
2023-10-17 19:57:15,339 epoch 7 - iter 135/272 - loss 0.01440026 - time (sec): 7.94 - samples/sec: 3167.80 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:57:16,937 epoch 7 - iter 162/272 - loss 0.01524027 - time (sec): 9.54 - samples/sec: 3181.29 - lr: 0.000019 - momentum: 0.000000
2023-10-17 19:57:18,475 epoch 7 - iter 189/272 - loss 0.01363766 - time (sec): 11.07 - samples/sec: 3223.90 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:57:20,051 epoch 7 - iter 216/272 - loss 0.01279916 - time (sec): 12.65 - samples/sec: 3247.31 - lr: 0.000018 - momentum: 0.000000
2023-10-17 19:57:21,757 epoch 7 - iter 243/272 - loss 0.01479402 - time (sec): 14.36 - samples/sec: 3231.58 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:57:23,391 epoch 7 - iter 270/272 - loss 0.01589460 - time (sec): 15.99 - samples/sec: 3231.65 - lr: 0.000017 - momentum: 0.000000
2023-10-17 19:57:23,498 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:23,498 EPOCH 7 done: loss 0.0158 - lr: 0.000017
2023-10-17 19:57:24,944 DEV : loss 0.16515286266803741 - f1-score (micro avg) 0.814
2023-10-17 19:57:24,951 saving best model
2023-10-17 19:57:25,573 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:27,265 epoch 8 - iter 27/272 - loss 0.00723031 - time (sec): 1.69 - samples/sec: 2942.52 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:57:29,114 epoch 8 - iter 54/272 - loss 0.01039951 - time (sec): 3.54 - samples/sec: 3105.30 - lr: 0.000016 - momentum: 0.000000
2023-10-17 19:57:30,841 epoch 8 - iter 81/272 - loss 0.01020122 - time (sec): 5.26 - samples/sec: 3104.25 - lr: 0.000015 - momentum: 0.000000
2023-10-17 19:57:32,550 epoch 8 - iter 108/272 - loss 0.01100986 - time (sec): 6.97 - samples/sec: 2990.26 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:57:34,626 epoch 8 - iter 135/272 - loss 0.01325530 - time (sec): 9.05 - samples/sec: 2913.62 - lr: 0.000014 - momentum: 0.000000
2023-10-17 19:57:36,295 epoch 8 - iter 162/272 - loss 0.01235273 - time (sec): 10.72 - samples/sec: 2947.60 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:57:38,102 epoch 8 - iter 189/272 - loss 0.01435970 - time (sec): 12.53 - samples/sec: 3020.99 - lr: 0.000013 - momentum: 0.000000
2023-10-17 19:57:39,541 epoch 8 - iter 216/272 - loss 0.01417857 - time (sec): 13.96 - samples/sec: 3006.09 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:57:41,022 epoch 8 - iter 243/272 - loss 0.01414234 - time (sec): 15.45 - samples/sec: 2994.35 - lr: 0.000012 - momentum: 0.000000
2023-10-17 19:57:42,698 epoch 8 - iter 270/272 - loss 0.01356212 - time (sec): 17.12 - samples/sec: 3023.14 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:57:42,788 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:42,789 EPOCH 8 done: loss 0.0136 - lr: 0.000011
2023-10-17 19:57:44,251 DEV : loss 0.17592966556549072 - f1-score (micro avg) 0.8222
2023-10-17 19:57:44,256 saving best model
2023-10-17 19:57:44,791 ----------------------------------------------------------------------------------------------------
2023-10-17 19:57:46,426 epoch 9 - iter 27/272 - loss 0.00538360 - time (sec): 1.63 - samples/sec: 2942.35 - lr: 0.000011 - momentum: 0.000000
2023-10-17 19:57:48,183 epoch 9 - iter 54/272 - loss 0.00340815 - time (sec): 3.39 - samples/sec: 2946.64 - lr: 0.000010 - momentum: 0.000000
2023-10-17 19:57:49,700 epoch 9 - iter 81/272 - loss 0.00343662 - time (sec): 4.91 - samples/sec: 2873.13 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:57:51,553 epoch 9 - iter 108/272 - loss 0.00716874 - time (sec): 6.76 - samples/sec: 2978.89 - lr: 0.000009 - momentum: 0.000000
2023-10-17 19:57:53,125 epoch 9 - iter 135/272 - loss 0.00825452 - time (sec): 8.33 - samples/sec: 2975.95 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:57:54,779 epoch 9 - iter 162/272 - loss 0.00786866 - time (sec): 9.98 - samples/sec: 2975.76 - lr: 0.000008 - momentum: 0.000000
2023-10-17 19:57:56,585 epoch 9 - iter 189/272 - loss 0.00753915 - time (sec): 11.79 - samples/sec: 3091.69 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:57:58,248 epoch 9 - iter 216/272 - loss 0.00811762 - time (sec): 13.45 - samples/sec: 3103.59 - lr: 0.000007 - momentum: 0.000000
2023-10-17 19:57:59,862 epoch 9 - iter 243/272 - loss 0.00912190 - time (sec): 15.07 - samples/sec: 3068.80 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:58:01,590 epoch 9 - iter 270/272 - loss 0.00837205 - time (sec): 16.79 - samples/sec: 3084.14 - lr: 0.000006 - momentum: 0.000000
2023-10-17 19:58:01,677 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:01,678 EPOCH 9 done: loss 0.0083 - lr: 0.000006
2023-10-17 19:58:03,185 DEV : loss 0.18768392503261566 - f1-score (micro avg) 0.7963
2023-10-17 19:58:03,191 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:04,833 epoch 10 - iter 27/272 - loss 0.00264070 - time (sec): 1.64 - samples/sec: 3087.66 - lr: 0.000005 - momentum: 0.000000
2023-10-17 19:58:06,348 epoch 10 - iter 54/272 - loss 0.00313848 - time (sec): 3.16 - samples/sec: 3101.26 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:58:07,852 epoch 10 - iter 81/272 - loss 0.00255139 - time (sec): 4.66 - samples/sec: 3117.78 - lr: 0.000004 - momentum: 0.000000
2023-10-17 19:58:09,420 epoch 10 - iter 108/272 - loss 0.00238475 - time (sec): 6.23 - samples/sec: 3211.81 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:58:11,094 epoch 10 - iter 135/272 - loss 0.00263451 - time (sec): 7.90 - samples/sec: 3256.43 - lr: 0.000003 - momentum: 0.000000
2023-10-17 19:58:12,882 epoch 10 - iter 162/272 - loss 0.00305484 - time (sec): 9.69 - samples/sec: 3260.03 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:58:14,418 epoch 10 - iter 189/272 - loss 0.00319056 - time (sec): 11.23 - samples/sec: 3223.06 - lr: 0.000002 - momentum: 0.000000
2023-10-17 19:58:16,162 epoch 10 - iter 216/272 - loss 0.00368433 - time (sec): 12.97 - samples/sec: 3182.47 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:58:18,016 epoch 10 - iter 243/272 - loss 0.00572804 - time (sec): 14.82 - samples/sec: 3165.87 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:58:19,571 epoch 10 - iter 270/272 - loss 0.00528442 - time (sec): 16.38 - samples/sec: 3162.27 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:58:19,667 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:19,667 EPOCH 10 done: loss 0.0054 - lr: 0.000000
2023-10-17 19:58:21,160 DEV : loss 0.18381302058696747 - f1-score (micro avg) 0.8037
2023-10-17 19:58:21,585 ----------------------------------------------------------------------------------------------------
2023-10-17 19:58:21,586 Loading model from best epoch ...
2023-10-17 19:58:23,333 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 19:58:25,658
Results:
- F-score (micro) 0.8069
- F-score (macro) 0.7678
- Accuracy 0.6937

By class:
              precision    recall  f1-score   support

         LOC     0.8140    0.8974    0.8537       312
         PER     0.7258    0.8654    0.7895       208
         ORG     0.5818    0.5818    0.5818        55
   HumanProd     0.7333    1.0000    0.8462        22

   micro avg     0.7592    0.8610    0.8069       597
   macro avg     0.7137    0.8362    0.7678       597
weighted avg     0.7589    0.8610    0.8060       597
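The aggregate rows above can be reproduced from the per-class rows: macro averaging takes the unweighted mean of the per-class F1 scores, weighted averaging weights them by support, and micro F1 is the harmonic mean of the pooled (micro) precision and recall. A quick check in plain Python, with the values copied from the table:

```python
# (precision, recall, f1, support) per class, from the table above
per_class = {
    "LOC":       (0.8140, 0.8974, 0.8537, 312),
    "PER":       (0.7258, 0.8654, 0.7895, 208),
    "ORG":       (0.5818, 0.5818, 0.5818,  55),
    "HumanProd": (0.7333, 1.0000, 0.8462,  22),
}

# Macro avg: unweighted mean of per-class F1.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted avg: per-class F1 weighted by support.
total_support = sum(s for _, _, _, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / total_support

# Micro F1: harmonic mean of the pooled precision/recall from the micro avg row.
micro_p, micro_r = 0.7592, 0.8610
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Matches the reported 0.7678 (macro), 0.8060 (weighted), 0.8069 (micro)
# up to the table's rounding.
print(round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
```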
2023-10-17 19:58:25,658 ----------------------------------------------------------------------------------------------------