2023-10-18 20:32:17,044 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:17,044 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 20:32:17,044 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:17,044 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 20:32:17,044 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:17,044 Train: 7936 sentences
2023-10-18 20:32:17,044 (train_with_dev=False, train_with_test=False)
2023-10-18 20:32:17,044 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:17,045 Training Params:
2023-10-18 20:32:17,045  - learning_rate: "3e-05"
2023-10-18 20:32:17,045  - mini_batch_size: "8"
2023-10-18 20:32:17,045  - max_epochs: "10"
2023-10-18 20:32:17,045  - shuffle: "True"
2023-10-18 20:32:17,045 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:17,045 Plugins:
2023-10-18 20:32:17,045  - TensorboardLogger
2023-10-18 20:32:17,045  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 20:32:17,045 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:17,045 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 20:32:17,045  - metric: "('micro avg', 'f1-score')"
2023-10-18 20:32:17,045 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:17,045 Computation:
2023-10-18 20:32:17,045  - compute on device: cuda:0
2023-10-18 20:32:17,045  - embedding storage: none
2023-10-18 20:32:17,045 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:17,045 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-18 20:32:17,045 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:17,045 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:17,045 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 20:32:19,153 epoch 1 - iter 99/992 - loss 3.26900601 - time (sec): 2.11 - samples/sec: 7728.61 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:32:21,668 epoch 1 - iter 198/992 - loss 3.01174688 - time (sec): 4.62 - samples/sec: 7025.90 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:32:23,881 epoch 1 - iter 297/992 - loss 2.64131737 - time (sec): 6.84 - samples/sec: 7198.35 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:32:26,462 epoch 1 - iter 396/992 - loss 2.23150630 - time (sec): 9.42 - samples/sec: 6982.05 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:32:28,635 epoch 1 - iter 495/992 - loss 1.90904405 - time (sec): 11.59 - samples/sec: 7026.98 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:32:30,871 epoch 1 - iter 594/992 - loss 1.67059385 - time (sec): 13.83 - samples/sec: 7081.35 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:32:33,068 epoch 1 - iter 693/992 - loss 1.49967758 - time (sec): 16.02 - samples/sec: 7114.89 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:32:35,281 epoch 1 - iter 792/992 - loss 1.36387729 - time (sec): 18.24 - samples/sec: 7181.71 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:32:37,528 epoch 1 - iter 891/992 - loss 1.25925333 - time (sec): 20.48 - samples/sec: 7183.73 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:32:39,760 epoch 1 - iter 990/992 - loss 1.17027992 - time (sec): 22.71 - samples/sec: 7203.37 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:32:39,805 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:39,805 EPOCH 1 done: loss 1.1686 - lr: 0.000030
2023-10-18 20:32:41,275 DEV : loss 0.25871363282203674 - f1-score (micro avg) 0.1041
2023-10-18 20:32:41,293 saving best model
2023-10-18 20:32:41,328 ----------------------------------------------------------------------------------------------------
2023-10-18 20:32:43,608 epoch 2 - iter 99/992 - loss 0.38667417 - time (sec): 2.28 - samples/sec: 7457.05 - lr: 0.000030 - momentum: 0.000000
2023-10-18 20:32:45,809 epoch 2 - iter 198/992 - loss 0.35532658 - time (sec): 4.48 - samples/sec: 7551.86 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:32:48,030 epoch 2 - iter 297/992 - loss 0.34540051 - time (sec): 6.70 - samples/sec: 7473.14 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:32:50,309 epoch 2 - iter 396/992 - loss 0.33013814 - time (sec): 8.98 - samples/sec: 7347.87 - lr: 0.000029 - momentum: 0.000000
2023-10-18 20:32:52,637 epoch 2 - iter 495/992 - loss 0.32197045 - time (sec): 11.31 - samples/sec: 7343.57 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:32:54,913 epoch 2 - iter 594/992 - loss 0.31801653 - time (sec): 13.59 - samples/sec: 7323.60 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:32:57,144 epoch 2 - iter 693/992 - loss 0.30876076 - time (sec): 15.82 - samples/sec: 7351.65 - lr: 0.000028 - momentum: 0.000000
2023-10-18 20:32:59,360 epoch 2 - iter 792/992 - loss 0.30673219 - time (sec): 18.03 - samples/sec: 7354.43 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:33:01,529 epoch 2 - iter 891/992 - loss 0.30357404 - time (sec): 20.20 - samples/sec: 7336.57 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:33:03,726 epoch 2 - iter 990/992 - loss 0.30024812 - time (sec): 22.40 - samples/sec: 7305.76 - lr: 0.000027 - momentum: 0.000000
2023-10-18 20:33:03,769 ----------------------------------------------------------------------------------------------------
2023-10-18 20:33:03,769 EPOCH 2 done: loss 0.3001 - lr: 0.000027
2023-10-18 20:33:05,604 DEV : loss 0.1853763908147812 - f1-score (micro avg) 0.3477
2023-10-18 20:33:05,622 saving best model
2023-10-18 20:33:05,658 ----------------------------------------------------------------------------------------------------
2023-10-18 20:33:07,896 epoch 3 - iter 99/992 - loss 0.23413942 - time (sec): 2.24 - samples/sec: 7285.51 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:33:10,145 epoch 3 - iter 198/992 - loss 0.22991950 - time (sec): 4.49 - samples/sec: 7312.34 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:33:12,378 epoch 3 - iter 297/992 - loss 0.25254134 - time (sec): 6.72 - samples/sec: 7251.19 - lr: 0.000026 - momentum: 0.000000
2023-10-18 20:33:14,643 epoch 3 - iter 396/992 - loss 0.24877043 - time (sec): 8.98 - samples/sec: 7308.99 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:33:16,809 epoch 3 - iter 495/992 - loss 0.24911355 - time (sec): 11.15 - samples/sec: 7288.56 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:33:19,031 epoch 3 - iter 594/992 - loss 0.24658049 - time (sec): 13.37 - samples/sec: 7337.19 - lr: 0.000025 - momentum: 0.000000
2023-10-18 20:33:21,280 epoch 3 - iter 693/992 - loss 0.24936620 - time (sec): 15.62 - samples/sec: 7303.17 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:33:23,521 epoch 3 - iter 792/992 - loss 0.24772256 - time (sec): 17.86 - samples/sec: 7324.51 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:33:25,756 epoch 3 - iter 891/992 - loss 0.24405967 - time (sec): 20.10 - samples/sec: 7330.03 - lr: 0.000024 - momentum: 0.000000
2023-10-18 20:33:27,957 epoch 3 - iter 990/992 - loss 0.24272779 - time (sec): 22.30 - samples/sec: 7334.09 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:33:28,002 ----------------------------------------------------------------------------------------------------
2023-10-18 20:33:28,002 EPOCH 3 done: loss 0.2428 - lr: 0.000023
2023-10-18 20:33:30,215 DEV : loss 0.16335979104042053 - f1-score (micro avg) 0.4241
2023-10-18 20:33:30,233 saving best model
2023-10-18 20:33:30,266 ----------------------------------------------------------------------------------------------------
2023-10-18 20:33:32,469 epoch 4 - iter 99/992 - loss 0.22443699 - time (sec): 2.20 - samples/sec: 7430.86 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:33:34,801 epoch 4 - iter 198/992 - loss 0.23701210 - time (sec): 4.53 - samples/sec: 7160.62 - lr: 0.000023 - momentum: 0.000000
2023-10-18 20:33:37,037 epoch 4 - iter 297/992 - loss 0.23242108 - time (sec): 6.77 - samples/sec: 7084.78 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:33:39,238 epoch 4 - iter 396/992 - loss 0.23657149 - time (sec): 8.97 - samples/sec: 7067.25 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:33:41,445 epoch 4 - iter 495/992 - loss 0.23101340 - time (sec): 11.18 - samples/sec: 7103.06 - lr: 0.000022 - momentum: 0.000000
2023-10-18 20:33:43,690 epoch 4 - iter 594/992 - loss 0.22285698 - time (sec): 13.42 - samples/sec: 7137.36 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:33:45,950 epoch 4 - iter 693/992 - loss 0.22245975 - time (sec): 15.68 - samples/sec: 7232.76 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:33:48,271 epoch 4 - iter 792/992 - loss 0.21974109 - time (sec): 18.00 - samples/sec: 7230.28 - lr: 0.000021 - momentum: 0.000000
2023-10-18 20:33:50,429 epoch 4 - iter 891/992 - loss 0.22008264 - time (sec): 20.16 - samples/sec: 7241.30 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:33:52,672 epoch 4 - iter 990/992 - loss 0.21727144 - time (sec): 22.41 - samples/sec: 7304.02 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:33:52,723 ----------------------------------------------------------------------------------------------------
2023-10-18 20:33:52,723 EPOCH 4 done: loss 0.2173 - lr: 0.000020
2023-10-18 20:33:54,524 DEV : loss 0.15653517842292786 - f1-score (micro avg) 0.4848
2023-10-18 20:33:54,542 saving best model
2023-10-18 20:33:54,578 ----------------------------------------------------------------------------------------------------
2023-10-18 20:33:56,710 epoch 5 - iter 99/992 - loss 0.24537712 - time (sec): 2.13 - samples/sec: 6996.16 - lr: 0.000020 - momentum: 0.000000
2023-10-18 20:33:58,940 epoch 5 - iter 198/992 - loss 0.21134377 - time (sec): 4.36 - samples/sec: 7459.07 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:34:01,175 epoch 5 - iter 297/992 - loss 0.20640811 - time (sec): 6.60 - samples/sec: 7433.04 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:34:03,545 epoch 5 - iter 396/992 - loss 0.20197952 - time (sec): 8.97 - samples/sec: 7362.38 - lr: 0.000019 - momentum: 0.000000
2023-10-18 20:34:05,772 epoch 5 - iter 495/992 - loss 0.20092622 - time (sec): 11.19 - samples/sec: 7287.58 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:34:08,064 epoch 5 - iter 594/992 - loss 0.20292860 - time (sec): 13.49 - samples/sec: 7292.89 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:34:10,381 epoch 5 - iter 693/992 - loss 0.20258902 - time (sec): 15.80 - samples/sec: 7276.17 - lr: 0.000018 - momentum: 0.000000
2023-10-18 20:34:12,621 epoch 5 - iter 792/992 - loss 0.20192145 - time (sec): 18.04 - samples/sec: 7294.30 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:34:14,841 epoch 5 - iter 891/992 - loss 0.20140955 - time (sec): 20.26 - samples/sec: 7293.58 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:34:17,121 epoch 5 - iter 990/992 - loss 0.20126076 - time (sec): 22.54 - samples/sec: 7259.31 - lr: 0.000017 - momentum: 0.000000
2023-10-18 20:34:17,169 ----------------------------------------------------------------------------------------------------
2023-10-18 20:34:17,169 EPOCH 5 done: loss 0.2012 - lr: 0.000017
2023-10-18 20:34:18,982 DEV : loss 0.1485081911087036 - f1-score (micro avg) 0.5227
2023-10-18 20:34:19,001 saving best model
2023-10-18 20:34:19,036 ----------------------------------------------------------------------------------------------------
2023-10-18 20:34:21,233 epoch 6 - iter 99/992 - loss 0.22430179 - time (sec): 2.20 - samples/sec: 7017.32 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:34:23,529 epoch 6 - iter 198/992 - loss 0.20436870 - time (sec): 4.49 - samples/sec: 7055.19 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:34:25,787 epoch 6 - iter 297/992 - loss 0.20029647 - time (sec): 6.75 - samples/sec: 7028.76 - lr: 0.000016 - momentum: 0.000000
2023-10-18 20:34:28,088 epoch 6 - iter 396/992 - loss 0.20256720 - time (sec): 9.05 - samples/sec: 7025.13 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:34:30,365 epoch 6 - iter 495/992 - loss 0.20317747 - time (sec): 11.33 - samples/sec: 7096.80 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:34:32,649 epoch 6 - iter 594/992 - loss 0.19651111 - time (sec): 13.61 - samples/sec: 7152.66 - lr: 0.000015 - momentum: 0.000000
2023-10-18 20:34:34,851 epoch 6 - iter 693/992 - loss 0.19268692 - time (sec): 15.81 - samples/sec: 7234.86 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:34:37,113 epoch 6 - iter 792/992 - loss 0.19172945 - time (sec): 18.08 - samples/sec: 7238.22 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:34:39,321 epoch 6 - iter 891/992 - loss 0.19360741 - time (sec): 20.28 - samples/sec: 7259.92 - lr: 0.000014 - momentum: 0.000000
2023-10-18 20:34:41,590 epoch 6 - iter 990/992 - loss 0.19064340 - time (sec): 22.55 - samples/sec: 7256.27 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:34:41,642 ----------------------------------------------------------------------------------------------------
2023-10-18 20:34:41,642 EPOCH 6 done: loss 0.1906 - lr: 0.000013
2023-10-18 20:34:43,465 DEV : loss 0.1433064043521881 - f1-score (micro avg) 0.5502
2023-10-18 20:34:43,483 saving best model
2023-10-18 20:34:43,516 ----------------------------------------------------------------------------------------------------
2023-10-18 20:34:45,840 epoch 7 - iter 99/992 - loss 0.21206962 - time (sec): 2.32 - samples/sec: 6893.32 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:34:48,076 epoch 7 - iter 198/992 - loss 0.19780535 - time (sec): 4.56 - samples/sec: 7095.06 - lr: 0.000013 - momentum: 0.000000
2023-10-18 20:34:50,388 epoch 7 - iter 297/992 - loss 0.19634963 - time (sec): 6.87 - samples/sec: 7052.69 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:34:52,686 epoch 7 - iter 396/992 - loss 0.18709449 - time (sec): 9.17 - samples/sec: 7163.54 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:34:54,903 epoch 7 - iter 495/992 - loss 0.18615423 - time (sec): 11.39 - samples/sec: 7245.35 - lr: 0.000012 - momentum: 0.000000
2023-10-18 20:34:57,221 epoch 7 - iter 594/992 - loss 0.18575283 - time (sec): 13.70 - samples/sec: 7174.74 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:34:59,457 epoch 7 - iter 693/992 - loss 0.18299791 - time (sec): 15.94 - samples/sec: 7216.08 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:35:01,674 epoch 7 - iter 792/992 - loss 0.18308143 - time (sec): 18.16 - samples/sec: 7207.35 - lr: 0.000011 - momentum: 0.000000
2023-10-18 20:35:03,934 epoch 7 - iter 891/992 - loss 0.18248990 - time (sec): 20.42 - samples/sec: 7196.20 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:35:06,151 epoch 7 - iter 990/992 - loss 0.18260769 - time (sec): 22.63 - samples/sec: 7235.83 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:35:06,190 ----------------------------------------------------------------------------------------------------
2023-10-18 20:35:06,190 EPOCH 7 done: loss 0.1826 - lr: 0.000010
2023-10-18 20:35:08,035 DEV : loss 0.13907533884048462 - f1-score (micro avg) 0.5623
2023-10-18 20:35:08,053 saving best model
2023-10-18 20:35:08,086 ----------------------------------------------------------------------------------------------------
2023-10-18 20:35:10,249 epoch 8 - iter 99/992 - loss 0.18323978 - time (sec): 2.16 - samples/sec: 7443.73 - lr: 0.000010 - momentum: 0.000000
2023-10-18 20:35:12,450 epoch 8 - iter 198/992 - loss 0.17828271 - time (sec): 4.36 - samples/sec: 7283.96 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:35:14,693 epoch 8 - iter 297/992 - loss 0.17876581 - time (sec): 6.61 - samples/sec: 7145.60 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:35:17,019 epoch 8 - iter 396/992 - loss 0.18045852 - time (sec): 8.93 - samples/sec: 7189.34 - lr: 0.000009 - momentum: 0.000000
2023-10-18 20:35:19,211 epoch 8 - iter 495/992 - loss 0.18102038 - time (sec): 11.12 - samples/sec: 7197.08 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:35:21,483 epoch 8 - iter 594/992 - loss 0.17757856 - time (sec): 13.40 - samples/sec: 7297.71 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:35:23,754 epoch 8 - iter 693/992 - loss 0.17685881 - time (sec): 15.67 - samples/sec: 7241.95 - lr: 0.000008 - momentum: 0.000000
2023-10-18 20:35:25,987 epoch 8 - iter 792/992 - loss 0.17740407 - time (sec): 17.90 - samples/sec: 7287.46 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:35:28,216 epoch 8 - iter 891/992 - loss 0.17815940 - time (sec): 20.13 - samples/sec: 7300.64 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:35:30,534 epoch 8 - iter 990/992 - loss 0.17824616 - time (sec): 22.45 - samples/sec: 7293.89 - lr: 0.000007 - momentum: 0.000000
2023-10-18 20:35:30,576 ----------------------------------------------------------------------------------------------------
2023-10-18 20:35:30,576 EPOCH 8 done: loss 0.1782 - lr: 0.000007
2023-10-18 20:35:32,760 DEV : loss 0.1372818648815155 - f1-score (micro avg) 0.5729
2023-10-18 20:35:32,778 saving best model
2023-10-18 20:35:32,810 ----------------------------------------------------------------------------------------------------
2023-10-18 20:35:35,075 epoch 9 - iter 99/992 - loss 0.15969196 - time (sec): 2.26 - samples/sec: 7173.67 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:35:37,180 epoch 9 - iter 198/992 - loss 0.17448311 - time (sec): 4.37 - samples/sec: 7508.59 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:35:39,362 epoch 9 - iter 297/992 - loss 0.17623452 - time (sec): 6.55 - samples/sec: 7368.61 - lr: 0.000006 - momentum: 0.000000
2023-10-18 20:35:41,713 epoch 9 - iter 396/992 - loss 0.17504733 - time (sec): 8.90 - samples/sec: 7178.40 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:35:43,965 epoch 9 - iter 495/992 - loss 0.17349129 - time (sec): 11.15 - samples/sec: 7243.56 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:35:46,194 epoch 9 - iter 594/992 - loss 0.17420767 - time (sec): 13.38 - samples/sec: 7235.24 - lr: 0.000005 - momentum: 0.000000
2023-10-18 20:35:48,470 epoch 9 - iter 693/992 - loss 0.17251140 - time (sec): 15.66 - samples/sec: 7298.69 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:35:50,625 epoch 9 - iter 792/992 - loss 0.17353050 - time (sec): 17.81 - samples/sec: 7306.77 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:35:52,833 epoch 9 - iter 891/992 - loss 0.17204365 - time (sec): 20.02 - samples/sec: 7322.75 - lr: 0.000004 - momentum: 0.000000
2023-10-18 20:35:55,115 epoch 9 - iter 990/992 - loss 0.17187736 - time (sec): 22.30 - samples/sec: 7339.19 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:35:55,161 ----------------------------------------------------------------------------------------------------
2023-10-18 20:35:55,161 EPOCH 9 done: loss 0.1719 - lr: 0.000003
2023-10-18 20:35:56,988 DEV : loss 0.13832369446754456 - f1-score (micro avg) 0.5708
2023-10-18 20:35:57,009 ----------------------------------------------------------------------------------------------------
2023-10-18 20:35:59,237 epoch 10 - iter 99/992 - loss 0.17701827 - time (sec): 2.23 - samples/sec: 7049.38 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:36:01,471 epoch 10 - iter 198/992 - loss 0.17977925 - time (sec): 4.46 - samples/sec: 7194.13 - lr: 0.000003 - momentum: 0.000000
2023-10-18 20:36:03,693 epoch 10 - iter 297/992 - loss 0.17530848 - time (sec): 6.68 - samples/sec: 7174.37 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:36:05,955 epoch 10 - iter 396/992 - loss 0.16854571 - time (sec): 8.95 - samples/sec: 7225.38 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:36:08,161 epoch 10 - iter 495/992 - loss 0.17158421 - time (sec): 11.15 - samples/sec: 7266.51 - lr: 0.000002 - momentum: 0.000000
2023-10-18 20:36:10,360 epoch 10 - iter 594/992 - loss 0.17275765 - time (sec): 13.35 - samples/sec: 7251.27 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:36:12,695 epoch 10 - iter 693/992 - loss 0.17130222 - time (sec): 15.69 - samples/sec: 7288.10 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:36:14,930 epoch 10 - iter 792/992 - loss 0.17016111 - time (sec): 17.92 - samples/sec: 7261.87 - lr: 0.000001 - momentum: 0.000000
2023-10-18 20:36:17,124 epoch 10 - iter 891/992 - loss 0.17025831 - time (sec): 20.11 - samples/sec: 7300.56 - lr: 0.000000 - momentum: 0.000000
2023-10-18 20:36:19,367 epoch 10 - iter 990/992 - loss 0.17132536 - time (sec): 22.36 - samples/sec: 7317.61 - lr: 0.000000 - momentum: 0.000000
2023-10-18 20:36:19,412 ----------------------------------------------------------------------------------------------------
2023-10-18 20:36:19,412 EPOCH 10 done: loss 0.1715 - lr: 0.000000
2023-10-18 20:36:21,225 DEV : loss 0.13621357083320618 - f1-score (micro avg) 0.5743
2023-10-18 20:36:21,244 saving best model
2023-10-18 20:36:21,307 ----------------------------------------------------------------------------------------------------
2023-10-18 20:36:21,307 Loading model from best epoch ...
2023-10-18 20:36:21,389 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 20:36:22,927 Results:
- F-score (micro) 0.5768
- F-score (macro) 0.3939
- Accuracy 0.4531

By class:
              precision    recall  f1-score   support

         LOC     0.7394    0.6672    0.7014       655
         PER     0.3516    0.6323    0.4519       223
         ORG     0.1429    0.0157    0.0284       127

   micro avg     0.5765    0.5771    0.5768      1005
   macro avg     0.4113    0.4384    0.3939      1005
weighted avg     0.5780    0.5771    0.5610      1005

2023-10-18 20:36:22,927 ----------------------------------------------------------------------------------------------------