2023-10-13 08:26:08,641 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:08,642 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 08:26:08,643 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:08,643 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-13 08:26:08,643 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:08,643 Train: 1100 sentences
2023-10-13 08:26:08,643 (train_with_dev=False, train_with_test=False)
2023-10-13 08:26:08,643 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:08,643 Training Params:
2023-10-13 08:26:08,643 - learning_rate: "3e-05"
2023-10-13 08:26:08,643 - mini_batch_size: "8"
2023-10-13 08:26:08,643 - max_epochs: "10"
2023-10-13 08:26:08,643 - shuffle: "True"
2023-10-13 08:26:08,643 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:08,643 Plugins:
2023-10-13 08:26:08,643 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:26:08,643 ----------------------------------------------------------------------------------------------------
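Note on the learning-rate column: the `lr` values logged per iteration in this file are consistent with a linear warmup followed by a linear decay. The sketch below is not Flair's scheduler code. It assumes the common shape (ramp from 0 to the peak rate over the first 10% of steps, then decay linearly to 0), 138 iterations per epoch (1100 sentences at mini-batch size 8, rounded up) and 10 epochs. The function name `linear_warmup_lr` is hypothetical.

```python
import math


def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction):
    """Learning rate at a given optimizer step (illustrative, not Flair's code)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from peak_lr down to 0 at the last step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from the "Training Params" and "Plugins" sections of this log.
steps_per_epoch = math.ceil(1100 / 8)   # iterations per epoch
total_steps = steps_per_epoch * 10      # max_epochs = 10

for epoch, it in [(1, 13), (1, 130), (2, 13), (10, 130)]:
    step = (epoch - 1) * steps_per_epoch + it
    lr = linear_warmup_lr(step, total_steps, 3e-05, 0.1)
    print(f"epoch {epoch} - iter {it}/{steps_per_epoch} - lr: {lr:.6f}")
```

Rounded to six decimals, these four values agree with the `lr` fields logged at the same iterations, which suggests the warmup covers exactly the first epoch.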
2023-10-13 08:26:08,643 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:26:08,643 - metric: "('micro avg', 'f1-score')"
2023-10-13 08:26:08,643 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:08,643 Computation:
2023-10-13 08:26:08,644 - compute on device: cuda:0
2023-10-13 08:26:08,644 - embedding storage: none
2023-10-13 08:26:08,644 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:08,644 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 08:26:08,644 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:08,644 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:09,422 epoch 1 - iter 13/138 - loss 3.69385232 - time (sec): 0.78 - samples/sec: 3038.15 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:26:10,209 epoch 1 - iter 26/138 - loss 3.48751350 - time (sec): 1.56 - samples/sec: 2794.15 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:26:10,915 epoch 1 - iter 39/138 - loss 3.14421571 - time (sec): 2.27 - samples/sec: 2861.72 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:26:11,700 epoch 1 - iter 52/138 - loss 2.55104060 - time (sec): 3.06 - samples/sec: 2958.09 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:26:12,383 epoch 1 - iter 65/138 - loss 2.27203720 - time (sec): 3.74 - samples/sec: 2925.20 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:26:13,075 epoch 1 - iter 78/138 - loss 2.05008068 - time (sec): 4.43 - samples/sec: 2937.45 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:26:13,765 epoch 1 - iter 91/138 - loss 1.90458869 - time (sec): 5.12 - samples/sec: 2929.88 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:26:14,469 epoch 1 - iter 104/138 - loss 1.74598608 - time (sec): 5.82 - samples/sec: 3002.95 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:26:15,175 epoch 1 - iter 117/138 - loss 1.62592389 - time (sec): 6.53 - samples/sec: 2999.69 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:26:15,861 epoch 1 - iter 130/138 - loss 1.53245007 - time (sec): 7.22 - samples/sec: 2988.19 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:26:16,273 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:16,273 EPOCH 1 done: loss 1.4877 - lr: 0.000028
2023-10-13 08:26:17,009 DEV : loss 0.4509204924106598 - f1-score (micro avg) 0.252
2023-10-13 08:26:17,014 saving best model
2023-10-13 08:26:17,324 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:18,094 epoch 2 - iter 13/138 - loss 0.42619151 - time (sec): 0.77 - samples/sec: 3139.00 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:26:18,813 epoch 2 - iter 26/138 - loss 0.42325569 - time (sec): 1.49 - samples/sec: 2842.22 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:26:19,574 epoch 2 - iter 39/138 - loss 0.40997183 - time (sec): 2.25 - samples/sec: 2873.00 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:26:20,308 epoch 2 - iter 52/138 - loss 0.38690384 - time (sec): 2.98 - samples/sec: 2847.84 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:26:21,088 epoch 2 - iter 65/138 - loss 0.35398854 - time (sec): 3.76 - samples/sec: 2828.03 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:26:21,829 epoch 2 - iter 78/138 - loss 0.34914707 - time (sec): 4.50 - samples/sec: 2839.82 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:26:22,619 epoch 2 - iter 91/138 - loss 0.33145623 - time (sec): 5.29 - samples/sec: 2804.89 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:26:23,354 epoch 2 - iter 104/138 - loss 0.32408274 - time (sec): 6.03 - samples/sec: 2836.09 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:26:24,107 epoch 2 - iter 117/138 - loss 0.31812163 - time (sec): 6.78 - samples/sec: 2840.83 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:26:24,862 epoch 2 - iter 130/138 - loss 0.30813736 - time (sec): 7.54 - samples/sec: 2864.21 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:26:25,257 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:25,257 EPOCH 2 done: loss 0.3027 - lr: 0.000027
2023-10-13 08:26:25,943 DEV : loss 0.17636282742023468 - f1-score (micro avg) 0.778
2023-10-13 08:26:25,948 saving best model
2023-10-13 08:26:26,398 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:27,114 epoch 3 - iter 13/138 - loss 0.19209402 - time (sec): 0.71 - samples/sec: 2745.27 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:26:27,844 epoch 3 - iter 26/138 - loss 0.17352554 - time (sec): 1.44 - samples/sec: 2823.15 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:26:28,634 epoch 3 - iter 39/138 - loss 0.15390368 - time (sec): 2.23 - samples/sec: 2883.66 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:26:29,333 epoch 3 - iter 52/138 - loss 0.15118087 - time (sec): 2.93 - samples/sec: 2943.74 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:26:30,113 epoch 3 - iter 65/138 - loss 0.14765733 - time (sec): 3.71 - samples/sec: 2924.73 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:26:30,808 epoch 3 - iter 78/138 - loss 0.14673277 - time (sec): 4.41 - samples/sec: 2912.56 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:26:31,557 epoch 3 - iter 91/138 - loss 0.14147220 - time (sec): 5.16 - samples/sec: 2928.81 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:26:32,243 epoch 3 - iter 104/138 - loss 0.14096598 - time (sec): 5.84 - samples/sec: 2920.42 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:26:32,960 epoch 3 - iter 117/138 - loss 0.14195431 - time (sec): 6.56 - samples/sec: 2918.85 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:26:33,700 epoch 3 - iter 130/138 - loss 0.14176228 - time (sec): 7.30 - samples/sec: 2926.93 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:26:34,155 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:34,156 EPOCH 3 done: loss 0.1380 - lr: 0.000024
2023-10-13 08:26:34,849 DEV : loss 0.143646240234375 - f1-score (micro avg) 0.8205
2023-10-13 08:26:34,855 saving best model
2023-10-13 08:26:35,284 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:35,970 epoch 4 - iter 13/138 - loss 0.08454532 - time (sec): 0.68 - samples/sec: 3075.26 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:26:36,679 epoch 4 - iter 26/138 - loss 0.08622317 - time (sec): 1.39 - samples/sec: 3188.79 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:26:37,419 epoch 4 - iter 39/138 - loss 0.08311552 - time (sec): 2.13 - samples/sec: 3020.59 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:26:38,143 epoch 4 - iter 52/138 - loss 0.08780046 - time (sec): 2.86 - samples/sec: 3015.55 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:26:38,891 epoch 4 - iter 65/138 - loss 0.09046928 - time (sec): 3.61 - samples/sec: 2989.97 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:26:39,603 epoch 4 - iter 78/138 - loss 0.08806674 - time (sec): 4.32 - samples/sec: 2967.54 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:26:40,326 epoch 4 - iter 91/138 - loss 0.09086554 - time (sec): 5.04 - samples/sec: 2956.46 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:26:41,041 epoch 4 - iter 104/138 - loss 0.09148364 - time (sec): 5.75 - samples/sec: 2937.77 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:26:41,774 epoch 4 - iter 117/138 - loss 0.08793837 - time (sec): 6.49 - samples/sec: 2965.76 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:26:42,559 epoch 4 - iter 130/138 - loss 0.08716024 - time (sec): 7.27 - samples/sec: 2967.65 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:26:42,991 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:42,991 EPOCH 4 done: loss 0.0879 - lr: 0.000020
2023-10-13 08:26:43,743 DEV : loss 0.11688197404146194 - f1-score (micro avg) 0.8363
2023-10-13 08:26:43,751 saving best model
2023-10-13 08:26:44,217 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:44,914 epoch 5 - iter 13/138 - loss 0.06018758 - time (sec): 0.70 - samples/sec: 3152.46 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:26:45,613 epoch 5 - iter 26/138 - loss 0.05888432 - time (sec): 1.40 - samples/sec: 3154.60 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:26:46,320 epoch 5 - iter 39/138 - loss 0.07909944 - time (sec): 2.10 - samples/sec: 3081.62 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:26:47,012 epoch 5 - iter 52/138 - loss 0.07517801 - time (sec): 2.79 - samples/sec: 3044.85 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:26:47,695 epoch 5 - iter 65/138 - loss 0.07414495 - time (sec): 3.48 - samples/sec: 3081.70 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:26:48,384 epoch 5 - iter 78/138 - loss 0.06957352 - time (sec): 4.17 - samples/sec: 3091.23 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:26:49,177 epoch 5 - iter 91/138 - loss 0.07023772 - time (sec): 4.96 - samples/sec: 3029.09 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:26:49,892 epoch 5 - iter 104/138 - loss 0.06663683 - time (sec): 5.67 - samples/sec: 3001.18 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:26:50,631 epoch 5 - iter 117/138 - loss 0.06142101 - time (sec): 6.41 - samples/sec: 2987.60 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:26:51,381 epoch 5 - iter 130/138 - loss 0.06264685 - time (sec): 7.16 - samples/sec: 2981.96 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:26:51,831 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:51,831 EPOCH 5 done: loss 0.0616 - lr: 0.000017
2023-10-13 08:26:52,525 DEV : loss 0.13779088854789734 - f1-score (micro avg) 0.8491
2023-10-13 08:26:52,529 saving best model
2023-10-13 08:26:52,961 ----------------------------------------------------------------------------------------------------
2023-10-13 08:26:53,703 epoch 6 - iter 13/138 - loss 0.04866097 - time (sec): 0.74 - samples/sec: 2926.90 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:26:54,438 epoch 6 - iter 26/138 - loss 0.04983253 - time (sec): 1.47 - samples/sec: 3002.73 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:26:55,179 epoch 6 - iter 39/138 - loss 0.04671730 - time (sec): 2.21 - samples/sec: 2963.62 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:26:55,881 epoch 6 - iter 52/138 - loss 0.04256161 - time (sec): 2.91 - samples/sec: 2936.14 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:26:56,642 epoch 6 - iter 65/138 - loss 0.04915792 - time (sec): 3.68 - samples/sec: 2898.60 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:26:57,370 epoch 6 - iter 78/138 - loss 0.05052720 - time (sec): 4.40 - samples/sec: 2902.03 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:26:58,105 epoch 6 - iter 91/138 - loss 0.04712163 - time (sec): 5.14 - samples/sec: 2912.36 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:26:58,791 epoch 6 - iter 104/138 - loss 0.04683642 - time (sec): 5.82 - samples/sec: 2920.37 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:26:59,545 epoch 6 - iter 117/138 - loss 0.04608847 - time (sec): 6.58 - samples/sec: 2921.76 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:27:00,304 epoch 6 - iter 130/138 - loss 0.04480660 - time (sec): 7.34 - samples/sec: 2928.18 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:27:00,745 ----------------------------------------------------------------------------------------------------
2023-10-13 08:27:00,745 EPOCH 6 done: loss 0.0445 - lr: 0.000014
2023-10-13 08:27:01,427 DEV : loss 0.15820646286010742 - f1-score (micro avg) 0.8578
2023-10-13 08:27:01,432 saving best model
2023-10-13 08:27:01,866 ----------------------------------------------------------------------------------------------------
2023-10-13 08:27:02,596 epoch 7 - iter 13/138 - loss 0.01760101 - time (sec): 0.73 - samples/sec: 2835.15 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:27:03,364 epoch 7 - iter 26/138 - loss 0.01763718 - time (sec): 1.50 - samples/sec: 2876.71 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:27:04,079 epoch 7 - iter 39/138 - loss 0.01604980 - time (sec): 2.21 - samples/sec: 2822.15 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:27:04,825 epoch 7 - iter 52/138 - loss 0.02286619 - time (sec): 2.96 - samples/sec: 2914.57 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:27:05,547 epoch 7 - iter 65/138 - loss 0.02495223 - time (sec): 3.68 - samples/sec: 2919.90 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:27:06,256 epoch 7 - iter 78/138 - loss 0.03142391 - time (sec): 4.39 - samples/sec: 2942.85 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:27:06,968 epoch 7 - iter 91/138 - loss 0.03459948 - time (sec): 5.10 - samples/sec: 2956.26 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:27:07,691 epoch 7 - iter 104/138 - loss 0.04089677 - time (sec): 5.82 - samples/sec: 2977.62 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:27:08,418 epoch 7 - iter 117/138 - loss 0.03890265 - time (sec): 6.55 - samples/sec: 2961.50 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:27:09,142 epoch 7 - iter 130/138 - loss 0.03775231 - time (sec): 7.27 - samples/sec: 2945.83 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:27:09,593 ----------------------------------------------------------------------------------------------------
2023-10-13 08:27:09,593 EPOCH 7 done: loss 0.0364 - lr: 0.000010
2023-10-13 08:27:10,332 DEV : loss 0.16149461269378662 - f1-score (micro avg) 0.8694
2023-10-13 08:27:10,341 saving best model
2023-10-13 08:27:10,872 ----------------------------------------------------------------------------------------------------
2023-10-13 08:27:11,670 epoch 8 - iter 13/138 - loss 0.02451766 - time (sec): 0.78 - samples/sec: 2718.28 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:27:12,481 epoch 8 - iter 26/138 - loss 0.01900098 - time (sec): 1.59 - samples/sec: 2654.23 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:27:13,274 epoch 8 - iter 39/138 - loss 0.03325250 - time (sec): 2.39 - samples/sec: 2719.54 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:27:14,039 epoch 8 - iter 52/138 - loss 0.03077508 - time (sec): 3.15 - samples/sec: 2655.42 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:27:14,872 epoch 8 - iter 65/138 - loss 0.02914906 - time (sec): 3.99 - samples/sec: 2673.98 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:27:15,649 epoch 8 - iter 78/138 - loss 0.02905655 - time (sec): 4.76 - samples/sec: 2691.84 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:27:16,390 epoch 8 - iter 91/138 - loss 0.03090136 - time (sec): 5.50 - samples/sec: 2750.11 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:27:17,116 epoch 8 - iter 104/138 - loss 0.02860076 - time (sec): 6.23 - samples/sec: 2779.83 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:27:17,831 epoch 8 - iter 117/138 - loss 0.02699396 - time (sec): 6.95 - samples/sec: 2788.13 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:27:18,668 epoch 8 - iter 130/138 - loss 0.02432561 - time (sec): 7.78 - samples/sec: 2791.72 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:27:19,073 ----------------------------------------------------------------------------------------------------
2023-10-13 08:27:19,073 EPOCH 8 done: loss 0.0255 - lr: 0.000007
2023-10-13 08:27:19,745 DEV : loss 0.16744698584079742 - f1-score (micro avg) 0.8701
2023-10-13 08:27:19,751 saving best model
2023-10-13 08:27:20,202 ----------------------------------------------------------------------------------------------------
2023-10-13 08:27:20,944 epoch 9 - iter 13/138 - loss 0.00538376 - time (sec): 0.73 - samples/sec: 3011.35 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:27:21,667 epoch 9 - iter 26/138 - loss 0.02331938 - time (sec): 1.45 - samples/sec: 2964.44 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:27:22,408 epoch 9 - iter 39/138 - loss 0.01893327 - time (sec): 2.19 - samples/sec: 2859.27 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:27:23,198 epoch 9 - iter 52/138 - loss 0.01939908 - time (sec): 2.98 - samples/sec: 2919.64 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:27:23,945 epoch 9 - iter 65/138 - loss 0.02382819 - time (sec): 3.73 - samples/sec: 2912.31 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:27:24,724 epoch 9 - iter 78/138 - loss 0.02378557 - time (sec): 4.51 - samples/sec: 2905.29 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:27:25,481 epoch 9 - iter 91/138 - loss 0.02458101 - time (sec): 5.27 - samples/sec: 2872.17 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:27:26,237 epoch 9 - iter 104/138 - loss 0.02381904 - time (sec): 6.02 - samples/sec: 2883.72 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:27:26,959 epoch 9 - iter 117/138 - loss 0.02435136 - time (sec): 6.74 - samples/sec: 2889.26 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:27:27,656 epoch 9 - iter 130/138 - loss 0.02457447 - time (sec): 7.44 - samples/sec: 2874.68 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:27:28,130 ----------------------------------------------------------------------------------------------------
2023-10-13 08:27:28,130 EPOCH 9 done: loss 0.0244 - lr: 0.000004
2023-10-13 08:27:28,844 DEV : loss 0.16501910984516144 - f1-score (micro avg) 0.8741
2023-10-13 08:27:28,850 saving best model
2023-10-13 08:27:29,280 ----------------------------------------------------------------------------------------------------
2023-10-13 08:27:30,048 epoch 10 - iter 13/138 - loss 0.02703412 - time (sec): 0.77 - samples/sec: 3082.15 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:27:30,794 epoch 10 - iter 26/138 - loss 0.01688742 - time (sec): 1.51 - samples/sec: 2890.78 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:27:31,538 epoch 10 - iter 39/138 - loss 0.01465273 - time (sec): 2.26 - samples/sec: 2855.54 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:27:32,259 epoch 10 - iter 52/138 - loss 0.01768204 - time (sec): 2.98 - samples/sec: 2879.42 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:27:32,978 epoch 10 - iter 65/138 - loss 0.01594072 - time (sec): 3.70 - samples/sec: 2851.87 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:27:33,760 epoch 10 - iter 78/138 - loss 0.01490758 - time (sec): 4.48 - samples/sec: 2852.30 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:27:34,525 epoch 10 - iter 91/138 - loss 0.01513863 - time (sec): 5.24 - samples/sec: 2876.63 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:27:35,251 epoch 10 - iter 104/138 - loss 0.01506427 - time (sec): 5.97 - samples/sec: 2910.48 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:27:35,944 epoch 10 - iter 117/138 - loss 0.01598269 - time (sec): 6.66 - samples/sec: 2900.63 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:27:36,673 epoch 10 - iter 130/138 - loss 0.01793335 - time (sec): 7.39 - samples/sec: 2895.22 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:27:37,116 ----------------------------------------------------------------------------------------------------
2023-10-13 08:27:37,116 EPOCH 10 done: loss 0.0179 - lr: 0.000000
2023-10-13 08:27:37,847 DEV : loss 0.16583986580371857 - f1-score (micro avg) 0.8694
2023-10-13 08:27:38,184 ----------------------------------------------------------------------------------------------------
2023-10-13 08:27:38,185 Loading model from best epoch ...
2023-10-13 08:27:39,730 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 08:27:40,442
Results:
- F-score (micro) 0.89
- F-score (macro) 0.6905
- Accuracy 0.819
By class:
              precision    recall  f1-score   support
       scope     0.8649    0.9091    0.8864       176
        pers     0.9380    0.9453    0.9416       128
        work     0.8243    0.8243    0.8243        74
         loc     0.6667    1.0000    0.8000         2
      object     0.0000    0.0000    0.0000         2
   micro avg     0.8798    0.9005    0.8900       382
   macro avg     0.6588    0.7357    0.6905       382
weighted avg     0.8759    0.9005    0.8878       382
2023-10-13 08:27:40,442 ----------------------------------------------------------------------------------------------------
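Note on the averaged rows: the gap between the micro F-score (0.89) and the macro F-score (0.6905) comes from the two rare classes, `loc` and `object`, which have only 2 test entities each. The sketch below recomputes the averaged F1 values from the per-class rows. It assumes the scikit-learn `classification_report` conventions (macro is the unweighted mean over classes, weighted is the support-weighted mean). The inputs are the four-decimal values printed above, so results can differ from the report in the last digit.

```python
# Per-class rows copied from the report above: (precision, recall, f1, support).
per_class = {
    "scope":  (0.8649, 0.9091, 0.8864, 176),
    "pers":   (0.9380, 0.9453, 0.9416, 128),
    "work":   (0.8243, 0.8243, 0.8243, 74),
    "loc":    (0.6667, 1.0000, 0.8000, 2),
    "object": (0.0000, 0.0000, 0.0000, 2),
}

F1, SUPPORT = 2, 3  # column indices in the tuples above


def macro_average(rows, column):
    """Unweighted mean over classes: every class counts equally."""
    return sum(row[column] for row in rows.values()) / len(rows)


def weighted_average(rows, column):
    """Mean over classes weighted by support (number of gold entities)."""
    total = sum(row[SUPPORT] for row in rows.values())
    return sum(row[column] * row[SUPPORT] for row in rows.values()) / total


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


macro_f1 = macro_average(per_class, F1)
weighted_f1 = weighted_average(per_class, F1)
micro_f1 = f1_from(0.8798, 0.9005)  # from the "micro avg" row

print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
print(f"micro F1:    {micro_f1:.4f}")
```

Because `object` scores 0.0 on 2 entities and still counts as one fifth of the macro average, the macro F-score is pulled far below the micro F-score, which is dominated by `scope` and `pers`.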