2023-10-18 21:15:31,626 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,626 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 21:15:31,627
----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 Train: 7936 sentences
2023-10-18 21:15:31,627 (train_with_dev=False, train_with_test=False)
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 Training Params:
2023-10-18 21:15:31,627 - learning_rate: "5e-05"
2023-10-18 21:15:31,627 - mini_batch_size: "8"
2023-10-18 21:15:31,627 - max_epochs: "10"
2023-10-18 21:15:31,627 - shuffle: "True"
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 Plugins:
2023-10-18 21:15:31,627 - TensorboardLogger
2023-10-18 21:15:31,627 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 21:15:31,627 - metric: "('micro avg', 'f1-score')"
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 Computation:
2023-10-18 21:15:31,627 - compute on device: cuda:0
2023-10-18 21:15:31,627 - embedding storage: none
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,628 Logging anything other than scalars to TensorBoard is currently
not supported.
2023-10-18 21:15:33,871 epoch 1 - iter 99/992 - loss 3.00035148 - time (sec): 2.24 - samples/sec: 7504.52 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:15:36,122 epoch 1 - iter 198/992 - loss 2.65024826 - time (sec): 4.49 - samples/sec: 7377.52 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:15:38,453 epoch 1 - iter 297/992 - loss 2.14493534 - time (sec): 6.83 - samples/sec: 7396.19 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:15:40,761 epoch 1 - iter 396/992 - loss 1.77405864 - time (sec): 9.13 - samples/sec: 7228.32 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:15:42,985 epoch 1 - iter 495/992 - loss 1.52199448 - time (sec): 11.36 - samples/sec: 7224.42 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:15:45,245 epoch 1 - iter 594/992 - loss 1.33697784 - time (sec): 13.62 - samples/sec: 7236.13 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:15:47,493 epoch 1 - iter 693/992 - loss 1.20151051 - time (sec): 15.87 - samples/sec: 7232.93 - lr: 0.000035 - momentum: 0.000000
2023-10-18 21:15:49,727 epoch 1 - iter 792/992 - loss 1.09367873 - time (sec): 18.10 - samples/sec: 7246.04 - lr: 0.000040 - momentum: 0.000000
2023-10-18 21:15:51,954 epoch 1 - iter 891/992 - loss 1.00765596 - time (sec): 20.33 - samples/sec: 7260.10 - lr: 0.000045 - momentum: 0.000000
2023-10-18 21:15:54,200 epoch 1 - iter 990/992 - loss 0.94167931 - time (sec): 22.57 - samples/sec: 7252.23 - lr: 0.000050 - momentum: 0.000000
2023-10-18 21:15:54,247 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:54,247 EPOCH 1 done: loss 0.9409 - lr: 0.000050
2023-10-18 21:15:55,813 DEV : loss 0.20528535544872284 - f1-score (micro avg) 0.2927
2023-10-18 21:15:55,832 saving best model
2023-10-18 21:15:55,870 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:58,007 epoch 2 - iter 99/992 - loss 0.29525902 - time (sec): 2.14 - samples/sec: 7453.35 -
lr: 0.000049 - momentum: 0.000000
2023-10-18 21:16:00,244 epoch 2 - iter 198/992 - loss 0.30020282 - time (sec): 4.37 - samples/sec: 7681.68 - lr: 0.000049 - momentum: 0.000000
2023-10-18 21:16:02,448 epoch 2 - iter 297/992 - loss 0.29276113 - time (sec): 6.58 - samples/sec: 7608.66 - lr: 0.000048 - momentum: 0.000000
2023-10-18 21:16:04,695 epoch 2 - iter 396/992 - loss 0.28180797 - time (sec): 8.82 - samples/sec: 7520.52 - lr: 0.000048 - momentum: 0.000000
2023-10-18 21:16:06,951 epoch 2 - iter 495/992 - loss 0.27842310 - time (sec): 11.08 - samples/sec: 7417.17 - lr: 0.000047 - momentum: 0.000000
2023-10-18 21:16:09,180 epoch 2 - iter 594/992 - loss 0.27099342 - time (sec): 13.31 - samples/sec: 7460.02 - lr: 0.000047 - momentum: 0.000000
2023-10-18 21:16:11,436 epoch 2 - iter 693/992 - loss 0.26826560 - time (sec): 15.57 - samples/sec: 7433.50 - lr: 0.000046 - momentum: 0.000000
2023-10-18 21:16:13,617 epoch 2 - iter 792/992 - loss 0.26477509 - time (sec): 17.75 - samples/sec: 7393.62 - lr: 0.000046 - momentum: 0.000000
2023-10-18 21:16:15,758 epoch 2 - iter 891/992 - loss 0.26317298 - time (sec): 19.89 - samples/sec: 7333.02 - lr: 0.000045 - momentum: 0.000000
2023-10-18 21:16:17,807 epoch 2 - iter 990/992 - loss 0.25944860 - time (sec): 21.94 - samples/sec: 7465.20 - lr: 0.000044 - momentum: 0.000000
2023-10-18 21:16:17,844 ----------------------------------------------------------------------------------------------------
2023-10-18 21:16:17,845 EPOCH 2 done: loss 0.2593 - lr: 0.000044
2023-10-18 21:16:20,056 DEV : loss 0.16714806854724884 - f1-score (micro avg) 0.4111
2023-10-18 21:16:20,075 saving best model
2023-10-18 21:16:20,110 ----------------------------------------------------------------------------------------------------
2023-10-18 21:16:22,218 epoch 3 - iter 99/992 - loss 0.23036403 - time (sec): 2.11 - samples/sec: 7812.27 - lr: 0.000044 - momentum: 0.000000
2023-10-18 21:16:24,469 epoch 3 - iter 198/992 - loss 0.23058276 - time (sec): 4.36 -
samples/sec: 7626.43 - lr: 0.000043 - momentum: 0.000000
2023-10-18 21:16:26,711 epoch 3 - iter 297/992 - loss 0.21820133 - time (sec): 6.60 - samples/sec: 7536.57 - lr: 0.000043 - momentum: 0.000000
2023-10-18 21:16:28,916 epoch 3 - iter 396/992 - loss 0.22504416 - time (sec): 8.81 - samples/sec: 7493.40 - lr: 0.000042 - momentum: 0.000000
2023-10-18 21:16:31,138 epoch 3 - iter 495/992 - loss 0.22164956 - time (sec): 11.03 - samples/sec: 7402.51 - lr: 0.000042 - momentum: 0.000000
2023-10-18 21:16:33,390 epoch 3 - iter 594/992 - loss 0.22088483 - time (sec): 13.28 - samples/sec: 7400.22 - lr: 0.000041 - momentum: 0.000000
2023-10-18 21:16:35,663 epoch 3 - iter 693/992 - loss 0.21848559 - time (sec): 15.55 - samples/sec: 7360.05 - lr: 0.000041 - momentum: 0.000000
2023-10-18 21:16:38,037 epoch 3 - iter 792/992 - loss 0.21608363 - time (sec): 17.93 - samples/sec: 7370.97 - lr: 0.000040 - momentum: 0.000000
2023-10-18 21:16:40,357 epoch 3 - iter 891/992 - loss 0.21449733 - time (sec): 20.25 - samples/sec: 7305.29 - lr: 0.000039 - momentum: 0.000000
2023-10-18 21:16:42,642 epoch 3 - iter 990/992 - loss 0.21398506 - time (sec): 22.53 - samples/sec: 7266.44 - lr: 0.000039 - momentum: 0.000000
2023-10-18 21:16:42,687 ----------------------------------------------------------------------------------------------------
2023-10-18 21:16:42,688 EPOCH 3 done: loss 0.2139 - lr: 0.000039
2023-10-18 21:16:44,501 DEV : loss 0.14900191128253937 - f1-score (micro avg) 0.5128
2023-10-18 21:16:44,520 saving best model
2023-10-18 21:16:44,555 ----------------------------------------------------------------------------------------------------
2023-10-18 21:16:46,843 epoch 4 - iter 99/992 - loss 0.19212402 - time (sec): 2.29 - samples/sec: 7423.04 - lr: 0.000038 - momentum: 0.000000
2023-10-18 21:16:49,021 epoch 4 - iter 198/992 - loss 0.19042800 - time (sec): 4.47 - samples/sec: 7091.02 - lr: 0.000038 - momentum: 0.000000
2023-10-18 21:16:51,197 epoch 4 - iter 297/992 - loss 0.18919111
- time (sec): 6.64 - samples/sec: 7171.20 - lr: 0.000037 - momentum: 0.000000
2023-10-18 21:16:53,390 epoch 4 - iter 396/992 - loss 0.18793688 - time (sec): 8.83 - samples/sec: 7224.86 - lr: 0.000037 - momentum: 0.000000
2023-10-18 21:16:55,585 epoch 4 - iter 495/992 - loss 0.19170755 - time (sec): 11.03 - samples/sec: 7327.72 - lr: 0.000036 - momentum: 0.000000
2023-10-18 21:16:57,801 epoch 4 - iter 594/992 - loss 0.18825759 - time (sec): 13.25 - samples/sec: 7363.35 - lr: 0.000036 - momentum: 0.000000
2023-10-18 21:17:00,013 epoch 4 - iter 693/992 - loss 0.19066190 - time (sec): 15.46 - samples/sec: 7338.86 - lr: 0.000035 - momentum: 0.000000
2023-10-18 21:17:02,303 epoch 4 - iter 792/992 - loss 0.18828028 - time (sec): 17.75 - samples/sec: 7312.02 - lr: 0.000034 - momentum: 0.000000
2023-10-18 21:17:04,532 epoch 4 - iter 891/992 - loss 0.18940494 - time (sec): 19.98 - samples/sec: 7301.53 - lr: 0.000034 - momentum: 0.000000
2023-10-18 21:17:06,861 epoch 4 - iter 990/992 - loss 0.18948850 - time (sec): 22.31 - samples/sec: 7335.54 - lr: 0.000033 - momentum: 0.000000
2023-10-18 21:17:06,913 ----------------------------------------------------------------------------------------------------
2023-10-18 21:17:06,913 EPOCH 4 done: loss 0.1897 - lr: 0.000033
2023-10-18 21:17:08,764 DEV : loss 0.14117993414402008 - f1-score (micro avg) 0.5446
2023-10-18 21:17:08,783 saving best model
2023-10-18 21:17:08,817 ----------------------------------------------------------------------------------------------------
2023-10-18 21:17:11,009 epoch 5 - iter 99/992 - loss 0.15229802 - time (sec): 2.19 - samples/sec: 7372.22 - lr: 0.000033 - momentum: 0.000000
2023-10-18 21:17:13,239 epoch 5 - iter 198/992 - loss 0.16129571 - time (sec): 4.42 - samples/sec: 7336.97 - lr: 0.000032 - momentum: 0.000000
2023-10-18 21:17:15,485 epoch 5 - iter 297/992 - loss 0.16681954 - time (sec): 6.67 - samples/sec: 7239.47 - lr: 0.000032 - momentum: 0.000000
2023-10-18 21:17:17,603 epoch 5 - iter
396/992 - loss 0.16422500 - time (sec): 8.78 - samples/sec: 7408.94 - lr: 0.000031 - momentum: 0.000000
2023-10-18 21:17:19,858 epoch 5 - iter 495/992 - loss 0.16568909 - time (sec): 11.04 - samples/sec: 7348.81 - lr: 0.000031 - momentum: 0.000000
2023-10-18 21:17:22,037 epoch 5 - iter 594/992 - loss 0.16934167 - time (sec): 13.22 - samples/sec: 7304.06 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:17:24,279 epoch 5 - iter 693/992 - loss 0.17025896 - time (sec): 15.46 - samples/sec: 7320.71 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:17:26,566 epoch 5 - iter 792/992 - loss 0.17179453 - time (sec): 17.75 - samples/sec: 7355.20 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:17:28,773 epoch 5 - iter 891/992 - loss 0.17143573 - time (sec): 19.96 - samples/sec: 7392.66 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:17:30,997 epoch 5 - iter 990/992 - loss 0.17025017 - time (sec): 22.18 - samples/sec: 7379.74 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:17:31,042 ----------------------------------------------------------------------------------------------------
2023-10-18 21:17:31,042 EPOCH 5 done: loss 0.1701 - lr: 0.000028
2023-10-18 21:17:32,868 DEV : loss 0.1357688307762146 - f1-score (micro avg) 0.5582
2023-10-18 21:17:32,887 saving best model
2023-10-18 21:17:32,922 ----------------------------------------------------------------------------------------------------
2023-10-18 21:17:35,146 epoch 6 - iter 99/992 - loss 0.17300109 - time (sec): 2.22 - samples/sec: 7210.04 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:17:37,367 epoch 6 - iter 198/992 - loss 0.17503273 - time (sec): 4.44 - samples/sec: 7303.51 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:17:39,620 epoch 6 - iter 297/992 - loss 0.16777607 - time (sec): 6.70 - samples/sec: 7436.18 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:17:41,855 epoch 6 - iter 396/992 - loss 0.16541905 - time (sec): 8.93 - samples/sec: 7339.38 - lr: 0.000026 - momentum: 0.000000
2023-10-18
21:17:44,101 epoch 6 - iter 495/992 - loss 0.16309203 - time (sec): 11.18 - samples/sec: 7376.75 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:17:46,202 epoch 6 - iter 594/992 - loss 0.16325905 - time (sec): 13.28 - samples/sec: 7387.33 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:17:48,434 epoch 6 - iter 693/992 - loss 0.16011057 - time (sec): 15.51 - samples/sec: 7404.51 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:17:50,641 epoch 6 - iter 792/992 - loss 0.15712117 - time (sec): 17.72 - samples/sec: 7412.21 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:17:52,843 epoch 6 - iter 891/992 - loss 0.16041497 - time (sec): 19.92 - samples/sec: 7364.21 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:17:55,110 epoch 6 - iter 990/992 - loss 0.15981362 - time (sec): 22.19 - samples/sec: 7375.09 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:17:55,159 ----------------------------------------------------------------------------------------------------
2023-10-18 21:17:55,159 EPOCH 6 done: loss 0.1597 - lr: 0.000022
2023-10-18 21:17:57,029 DEV : loss 0.13293296098709106 - f1-score (micro avg) 0.5787
2023-10-18 21:17:57,048 saving best model
2023-10-18 21:17:57,082 ----------------------------------------------------------------------------------------------------
2023-10-18 21:17:59,315 epoch 7 - iter 99/992 - loss 0.13942076 - time (sec): 2.23 - samples/sec: 7690.95 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:18:01,494 epoch 7 - iter 198/992 - loss 0.14325448 - time (sec): 4.41 - samples/sec: 7803.96 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:18:03,759 epoch 7 - iter 297/992 - loss 0.14913377 - time (sec): 6.68 - samples/sec: 7639.00 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:18:05,986 epoch 7 - iter 396/992 - loss 0.15476655 - time (sec): 8.90 - samples/sec: 7534.24 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:18:08,218 epoch 7 - iter 495/992 - loss 0.15347097 - time (sec): 11.14 - samples/sec: 7513.37 - lr: 0.000019 -
momentum: 0.000000
2023-10-18 21:18:10,453 epoch 7 - iter 594/992 - loss 0.15068835 - time (sec): 13.37 - samples/sec: 7485.75 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:18:12,712 epoch 7 - iter 693/992 - loss 0.14857101 - time (sec): 15.63 - samples/sec: 7470.10 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:18:14,906 epoch 7 - iter 792/992 - loss 0.14959639 - time (sec): 17.82 - samples/sec: 7457.26 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:18:17,123 epoch 7 - iter 891/992 - loss 0.14984248 - time (sec): 20.04 - samples/sec: 7383.31 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:18:19,314 epoch 7 - iter 990/992 - loss 0.15087522 - time (sec): 22.23 - samples/sec: 7355.24 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:18:19,366 ----------------------------------------------------------------------------------------------------
2023-10-18 21:18:19,366 EPOCH 7 done: loss 0.1508 - lr: 0.000017
2023-10-18 21:18:21,594 DEV : loss 0.13094773888587952 - f1-score (micro avg) 0.5961
2023-10-18 21:18:21,612 saving best model
2023-10-18 21:18:21,648 ----------------------------------------------------------------------------------------------------
2023-10-18 21:18:23,910 epoch 8 - iter 99/992 - loss 0.13743854 - time (sec): 2.26 - samples/sec: 7400.23 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:18:26,262 epoch 8 - iter 198/992 - loss 0.14110764 - time (sec): 4.61 - samples/sec: 7204.42 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:18:28,496 epoch 8 - iter 297/992 - loss 0.14324894 - time (sec): 6.85 - samples/sec: 7150.71 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:18:30,689 epoch 8 - iter 396/992 - loss 0.14564214 - time (sec): 9.04 - samples/sec: 7201.23 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:18:32,915 epoch 8 - iter 495/992 - loss 0.14616240 - time (sec): 11.27 - samples/sec: 7330.10 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:18:35,175 epoch 8 - iter 594/992 - loss 0.14778882 - time (sec): 13.53 - samples/sec:
7299.12 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:18:37,371 epoch 8 - iter 693/992 - loss 0.14760928 - time (sec): 15.72 - samples/sec: 7260.68 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:18:39,585 epoch 8 - iter 792/992 - loss 0.14535005 - time (sec): 17.94 - samples/sec: 7304.22 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:18:41,814 epoch 8 - iter 891/992 - loss 0.14591531 - time (sec): 20.16 - samples/sec: 7302.19 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:18:44,017 epoch 8 - iter 990/992 - loss 0.14527308 - time (sec): 22.37 - samples/sec: 7313.69 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:18:44,065 ----------------------------------------------------------------------------------------------------
2023-10-18 21:18:44,065 EPOCH 8 done: loss 0.1451 - lr: 0.000011
2023-10-18 21:18:45,896 DEV : loss 0.13168883323669434 - f1-score (micro avg) 0.5982
2023-10-18 21:18:45,915 saving best model
2023-10-18 21:18:45,951 ----------------------------------------------------------------------------------------------------
2023-10-18 21:18:48,175 epoch 9 - iter 99/992 - loss 0.13619358 - time (sec): 2.22 - samples/sec: 7282.70 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:18:50,442 epoch 9 - iter 198/992 - loss 0.13633010 - time (sec): 4.49 - samples/sec: 7272.94 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:18:52,662 epoch 9 - iter 297/992 - loss 0.13650790 - time (sec): 6.71 - samples/sec: 7255.07 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:18:54,868 epoch 9 - iter 396/992 - loss 0.13657536 - time (sec): 8.92 - samples/sec: 7228.79 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:18:57,157 epoch 9 - iter 495/992 - loss 0.13625592 - time (sec): 11.21 - samples/sec: 7357.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:18:59,407 epoch 9 - iter 594/992 - loss 0.13653751 - time (sec): 13.46 - samples/sec: 7356.54 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:19:01,641 epoch 9 - iter 693/992 - loss 0.13850946 - time (sec):
15.69 - samples/sec: 7338.11 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:19:03,933 epoch 9 - iter 792/992 - loss 0.13733639 - time (sec): 17.98 - samples/sec: 7322.44 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:19:06,261 epoch 9 - iter 891/992 - loss 0.13746495 - time (sec): 20.31 - samples/sec: 7268.68 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:19:08,482 epoch 9 - iter 990/992 - loss 0.13949435 - time (sec): 22.53 - samples/sec: 7265.99 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:19:08,526 ----------------------------------------------------------------------------------------------------
2023-10-18 21:19:08,527 EPOCH 9 done: loss 0.1393 - lr: 0.000006
2023-10-18 21:19:10,353 DEV : loss 0.13056860864162445 - f1-score (micro avg) 0.6056
2023-10-18 21:19:10,372 saving best model
2023-10-18 21:19:10,406 ----------------------------------------------------------------------------------------------------
2023-10-18 21:19:12,669 epoch 10 - iter 99/992 - loss 0.13940439 - time (sec): 2.26 - samples/sec: 7075.75 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:19:14,890 epoch 10 - iter 198/992 - loss 0.13866219 - time (sec): 4.48 - samples/sec: 7328.81 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:19:17,089 epoch 10 - iter 297/992 - loss 0.13654053 - time (sec): 6.68 - samples/sec: 7366.53 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:19:19,280 epoch 10 - iter 396/992 - loss 0.13952155 - time (sec): 8.87 - samples/sec: 7422.76 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:19:21,607 epoch 10 - iter 495/992 - loss 0.13874725 - time (sec): 11.20 - samples/sec: 7386.53 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:19:23,650 epoch 10 - iter 594/992 - loss 0.14103061 - time (sec): 13.24 - samples/sec: 7453.94 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:19:25,737 epoch 10 - iter 693/992 - loss 0.14107427 - time (sec): 15.33 - samples/sec: 7458.99 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:19:27,975 epoch 10 - iter 792/992 -
loss 0.13930647 - time (sec): 17.57 - samples/sec: 7445.01 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:19:30,292 epoch 10 - iter 891/992 - loss 0.14078958 - time (sec): 19.89 - samples/sec: 7377.90 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:19:32,519 epoch 10 - iter 990/992 - loss 0.13836888 - time (sec): 22.11 - samples/sec: 7400.86 - lr: 0.000000 - momentum: 0.000000
2023-10-18 21:19:32,565 ----------------------------------------------------------------------------------------------------
2023-10-18 21:19:32,565 EPOCH 10 done: loss 0.1383 - lr: 0.000000
2023-10-18 21:19:34,388 DEV : loss 0.13230597972869873 - f1-score (micro avg) 0.6031
2023-10-18 21:19:34,434 ----------------------------------------------------------------------------------------------------
2023-10-18 21:19:34,435 Loading model from best epoch ...
2023-10-18 21:19:34,517 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 21:19:36,025 Results:
- F-score (micro) 0.6226
- F-score (macro) 0.4557
- Accuracy 0.4912

By class:
              precision    recall  f1-score   support

         LOC     0.7189    0.7496    0.7339       655
         PER     0.4162    0.6233    0.4991       223
         ORG     0.2973    0.0866    0.1341       127

   micro avg     0.6082    0.6378    0.6226      1005
   macro avg     0.4775    0.4865    0.4557      1005
weighted avg     0.5984    0.6378    0.6060      1005

2023-10-18 21:19:36,025 ----------------------------------------------------------------------------------------------------
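
Note on the averaged scores in the final results: a minimal sketch, assuming the usual definitions (macro = unweighted mean of per-class F1, weighted = support-weighted mean, micro F1 = harmonic mean of micro precision and recall). The numbers are copied from the "By class" report in this log; the variable and function names are illustrative only and are not part of Flair. Because the inputs are already rounded to four decimals, the recomputed values can differ from the log in the last digit.

```python
# Per-class scores copied from the final evaluation in the log:
# (precision, recall, f1, support)
per_class = {
    "LOC": (0.7189, 0.7496, 0.7339, 655),
    "PER": (0.4162, 0.6233, 0.4991, 223),
    "ORG": (0.2973, 0.0866, 0.1341, 127),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Macro average: every class counts equally, regardless of support.
macro_f1 = sum(scores[2] for scores in per_class.values()) / len(per_class)

# Weighted average: each class F1 is weighted by its support.
total_support = sum(scores[3] for scores in per_class.values())
weighted_f1 = sum(scores[2] * scores[3] for scores in per_class.values()) / total_support

# Micro F1 from the micro-averaged precision and recall reported in the log.
micro_f1 = f1(0.6082, 0.6378)

print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"micro F1    {micro_f1:.4f}")
```

The gap between micro F1 and macro F1 comes mostly from ORG, which has low recall and the smallest support.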