2023-10-18 21:44:50,558 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 21:44:50,559 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 21:44:50,559 
----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 Train: 7936 sentences
2023-10-18 21:44:50,559 (train_with_dev=False, train_with_test=False)
2023-10-18 21:44:50,559 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 Training Params:
2023-10-18 21:44:50,559 - learning_rate: "5e-05"
2023-10-18 21:44:50,559 - mini_batch_size: "4"
2023-10-18 21:44:50,559 - max_epochs: "10"
2023-10-18 21:44:50,559 - shuffle: "True"
2023-10-18 21:44:50,559 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 Plugins:
2023-10-18 21:44:50,559 - TensorboardLogger
2023-10-18 21:44:50,559 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 21:44:50,559 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 21:44:50,559 - metric: "('micro avg', 'f1-score')"
2023-10-18 21:44:50,559 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 Computation:
2023-10-18 21:44:50,559 - compute on device: cuda:0
2023-10-18 21:44:50,559 - embedding storage: none
2023-10-18 21:44:50,559 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,560 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 21:44:50,560 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,560 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,560 Logging anything other than scalars to TensorBoard is currently
not supported.
2023-10-18 21:44:53,702 epoch 1 - iter 198/1984 - loss 3.02755835 - time (sec): 3.14 - samples/sec: 5433.82 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:44:56,723 epoch 1 - iter 396/1984 - loss 2.50705750 - time (sec): 6.16 - samples/sec: 5318.29 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:44:59,776 epoch 1 - iter 594/1984 - loss 1.89414489 - time (sec): 9.22 - samples/sec: 5401.23 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:45:02,860 epoch 1 - iter 792/1984 - loss 1.52591738 - time (sec): 12.30 - samples/sec: 5438.96 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:45:05,884 epoch 1 - iter 990/1984 - loss 1.32097567 - time (sec): 15.32 - samples/sec: 5430.32 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:45:08,933 epoch 1 - iter 1188/1984 - loss 1.16575852 - time (sec): 18.37 - samples/sec: 5416.07 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:45:11,986 epoch 1 - iter 1386/1984 - loss 1.04445703 - time (sec): 21.43 - samples/sec: 5421.99 - lr: 0.000035 - momentum: 0.000000
2023-10-18 21:45:15,011 epoch 1 - iter 1584/1984 - loss 0.95783838 - time (sec): 24.45 - samples/sec: 5385.50 - lr: 0.000040 - momentum: 0.000000
2023-10-18 21:45:18,008 epoch 1 - iter 1782/1984 - loss 0.88791048 - time (sec): 27.45 - samples/sec: 5377.95 - lr: 0.000045 - momentum: 0.000000
2023-10-18 21:45:21,038 epoch 1 - iter 1980/1984 - loss 0.83155339 - time (sec): 30.48 - samples/sec: 5371.39 - lr: 0.000050 - momentum: 0.000000
2023-10-18 21:45:21,095 ----------------------------------------------------------------------------------------------------
2023-10-18 21:45:21,095 EPOCH 1 done: loss 0.8309 - lr: 0.000050
2023-10-18 21:45:22,631 DEV : loss 0.21765930950641632 - f1-score (micro avg) 0.3337
2023-10-18 21:45:22,649 saving best model
2023-10-18 21:45:22,682 ----------------------------------------------------------------------------------------------------
2023-10-18 21:45:25,694 epoch 2 - iter 198/1984 - loss 0.31683963 - time (sec): 3.01 -
samples/sec: 5386.59 - lr: 0.000049 - momentum: 0.000000
2023-10-18 21:45:28,770 epoch 2 - iter 396/1984 - loss 0.29456292 - time (sec): 6.09 - samples/sec: 5410.37 - lr: 0.000049 - momentum: 0.000000
2023-10-18 21:45:31,796 epoch 2 - iter 594/1984 - loss 0.28745860 - time (sec): 9.11 - samples/sec: 5414.94 - lr: 0.000048 - momentum: 0.000000
2023-10-18 21:45:34,843 epoch 2 - iter 792/1984 - loss 0.28332924 - time (sec): 12.16 - samples/sec: 5467.12 - lr: 0.000048 - momentum: 0.000000
2023-10-18 21:45:37,882 epoch 2 - iter 990/1984 - loss 0.27820558 - time (sec): 15.20 - samples/sec: 5452.84 - lr: 0.000047 - momentum: 0.000000
2023-10-18 21:45:40,914 epoch 2 - iter 1188/1984 - loss 0.27692060 - time (sec): 18.23 - samples/sec: 5428.66 - lr: 0.000047 - momentum: 0.000000
2023-10-18 21:45:43,924 epoch 2 - iter 1386/1984 - loss 0.27338747 - time (sec): 21.24 - samples/sec: 5368.92 - lr: 0.000046 - momentum: 0.000000
2023-10-18 21:45:46,946 epoch 2 - iter 1584/1984 - loss 0.27329556 - time (sec): 24.26 - samples/sec: 5333.65 - lr: 0.000046 - momentum: 0.000000
2023-10-18 21:45:50,005 epoch 2 - iter 1782/1984 - loss 0.26872899 - time (sec): 27.32 - samples/sec: 5362.45 - lr: 0.000045 - momentum: 0.000000
2023-10-18 21:45:53,066 epoch 2 - iter 1980/1984 - loss 0.26392738 - time (sec): 30.38 - samples/sec: 5383.14 - lr: 0.000044 - momentum: 0.000000
2023-10-18 21:45:53,129 ----------------------------------------------------------------------------------------------------
2023-10-18 21:45:53,129 EPOCH 2 done: loss 0.2637 - lr: 0.000044
2023-10-18 21:45:54,965 DEV : loss 0.18782073259353638 - f1-score (micro avg) 0.3894
2023-10-18 21:45:54,983 saving best model
2023-10-18 21:45:55,015 ----------------------------------------------------------------------------------------------------
2023-10-18 21:45:57,994 epoch 3 - iter 198/1984 - loss 0.23184493 - time (sec): 2.98 - samples/sec: 5373.79 - lr: 0.000044 - momentum: 0.000000
2023-10-18 21:46:00,989 epoch 3 - iter 396/1984
- loss 0.22422770 - time (sec): 5.97 - samples/sec: 5368.69 - lr: 0.000043 - momentum: 0.000000
2023-10-18 21:46:03,985 epoch 3 - iter 594/1984 - loss 0.21842906 - time (sec): 8.97 - samples/sec: 5335.00 - lr: 0.000043 - momentum: 0.000000
2023-10-18 21:46:07,452 epoch 3 - iter 792/1984 - loss 0.22427957 - time (sec): 12.44 - samples/sec: 5161.76 - lr: 0.000042 - momentum: 0.000000
2023-10-18 21:46:10,251 epoch 3 - iter 990/1984 - loss 0.22526202 - time (sec): 15.23 - samples/sec: 5294.55 - lr: 0.000042 - momentum: 0.000000
2023-10-18 21:46:13,311 epoch 3 - iter 1188/1984 - loss 0.22436343 - time (sec): 18.29 - samples/sec: 5313.27 - lr: 0.000041 - momentum: 0.000000
2023-10-18 21:46:16,377 epoch 3 - iter 1386/1984 - loss 0.22350554 - time (sec): 21.36 - samples/sec: 5321.49 - lr: 0.000041 - momentum: 0.000000
2023-10-18 21:46:19,427 epoch 3 - iter 1584/1984 - loss 0.22239930 - time (sec): 24.41 - samples/sec: 5338.02 - lr: 0.000040 - momentum: 0.000000
2023-10-18 21:46:22,480 epoch 3 - iter 1782/1984 - loss 0.22294790 - time (sec): 27.46 - samples/sec: 5331.42 - lr: 0.000039 - momentum: 0.000000
2023-10-18 21:46:25,656 epoch 3 - iter 1980/1984 - loss 0.21981448 - time (sec): 30.64 - samples/sec: 5343.64 - lr: 0.000039 - momentum: 0.000000
2023-10-18 21:46:25,714 ----------------------------------------------------------------------------------------------------
2023-10-18 21:46:25,714 EPOCH 3 done: loss 0.2202 - lr: 0.000039
2023-10-18 21:46:27,536 DEV : loss 0.16426852345466614 - f1-score (micro avg) 0.4378
2023-10-18 21:46:27,556 saving best model
2023-10-18 21:46:27,591 ----------------------------------------------------------------------------------------------------
2023-10-18 21:46:30,645 epoch 4 - iter 198/1984 - loss 0.21821162 - time (sec): 3.05 - samples/sec: 5334.38 - lr: 0.000038 - momentum: 0.000000
2023-10-18 21:46:33,608 epoch 4 - iter 396/1984 - loss 0.20572668 - time (sec): 6.02 - samples/sec: 5322.24 - lr: 0.000038 - momentum: 0.000000
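The lr column above ramps from 0.000005 to 0.000050 across epoch 1 and then falls linearly toward zero, which matches the logged `LinearScheduler | warmup_fraction: '0.1'` plugin: 1984 iterations/epoch × 10 epochs = 19840 steps, so warmup covers roughly the first epoch. A minimal sketch of such a warmup-then-linear-decay schedule; the step bookkeeping is an assumption and may differ from Flair's actual scheduler in off-by-one details:

```python
def linear_warmup_decay_lr(step: int,
                           peak_lr: float = 5e-05,      # learning_rate from the log
                           total_steps: int = 19840,    # 1984 iters/epoch * 10 epochs
                           warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay back to zero."""
    warmup_steps = int(total_steps * warmup_fraction)   # 1984 steps here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps            # ramp up during epoch 1
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Rounded to 6 decimals this reproduces the logged values, e.g.
# step 198 (epoch 1, iter 198)  -> 0.000005
# step 2182 (epoch 2, iter 198) -> 0.000049
```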
2023-10-18 21:46:36,639 epoch 4 - iter 594/1984 - loss 0.20384467 - time (sec): 9.05 - samples/sec: 5369.64 - lr: 0.000037 - momentum: 0.000000
2023-10-18 21:46:39,741 epoch 4 - iter 792/1984 - loss 0.20226613 - time (sec): 12.15 - samples/sec: 5356.48 - lr: 0.000037 - momentum: 0.000000
2023-10-18 21:46:42,759 epoch 4 - iter 990/1984 - loss 0.20003297 - time (sec): 15.17 - samples/sec: 5402.33 - lr: 0.000036 - momentum: 0.000000
2023-10-18 21:46:45,772 epoch 4 - iter 1188/1984 - loss 0.20039277 - time (sec): 18.18 - samples/sec: 5425.68 - lr: 0.000036 - momentum: 0.000000
2023-10-18 21:46:48,837 epoch 4 - iter 1386/1984 - loss 0.20002361 - time (sec): 21.25 - samples/sec: 5445.13 - lr: 0.000035 - momentum: 0.000000
2023-10-18 21:46:51,873 epoch 4 - iter 1584/1984 - loss 0.19776710 - time (sec): 24.28 - samples/sec: 5438.11 - lr: 0.000034 - momentum: 0.000000
2023-10-18 21:46:54,901 epoch 4 - iter 1782/1984 - loss 0.19638486 - time (sec): 27.31 - samples/sec: 5423.82 - lr: 0.000034 - momentum: 0.000000
2023-10-18 21:46:57,951 epoch 4 - iter 1980/1984 - loss 0.19597838 - time (sec): 30.36 - samples/sec: 5390.08 - lr: 0.000033 - momentum: 0.000000
2023-10-18 21:46:58,013 ----------------------------------------------------------------------------------------------------
2023-10-18 21:46:58,013 EPOCH 4 done: loss 0.1958 - lr: 0.000033
2023-10-18 21:46:59,836 DEV : loss 0.1590408831834793 - f1-score (micro avg) 0.4645
2023-10-18 21:46:59,855 saving best model
2023-10-18 21:46:59,889 ----------------------------------------------------------------------------------------------------
2023-10-18 21:47:02,987 epoch 5 - iter 198/1984 - loss 0.19862903 - time (sec): 3.10 - samples/sec: 4997.21 - lr: 0.000033 - momentum: 0.000000
2023-10-18 21:47:06,060 epoch 5 - iter 396/1984 - loss 0.19073285 - time (sec): 6.17 - samples/sec: 5095.42 - lr: 0.000032 - momentum: 0.000000
2023-10-18 21:47:09,118 epoch 5 - iter 594/1984 - loss 0.18017645 - time (sec): 9.23 - samples/sec:
5174.83 - lr: 0.000032 - momentum: 0.000000
2023-10-18 21:47:12,192 epoch 5 - iter 792/1984 - loss 0.18150575 - time (sec): 12.30 - samples/sec: 5182.82 - lr: 0.000031 - momentum: 0.000000
2023-10-18 21:47:15,250 epoch 5 - iter 990/1984 - loss 0.18132472 - time (sec): 15.36 - samples/sec: 5225.93 - lr: 0.000031 - momentum: 0.000000
2023-10-18 21:47:18,322 epoch 5 - iter 1188/1984 - loss 0.17910642 - time (sec): 18.43 - samples/sec: 5286.47 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:47:21,373 epoch 5 - iter 1386/1984 - loss 0.17861652 - time (sec): 21.48 - samples/sec: 5314.48 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:47:24,445 epoch 5 - iter 1584/1984 - loss 0.17851281 - time (sec): 24.56 - samples/sec: 5329.37 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:47:27,499 epoch 5 - iter 1782/1984 - loss 0.17681625 - time (sec): 27.61 - samples/sec: 5357.24 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:47:30,515 epoch 5 - iter 1980/1984 - loss 0.17933583 - time (sec): 30.62 - samples/sec: 5342.88 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:47:30,583 ----------------------------------------------------------------------------------------------------
2023-10-18 21:47:30,583 EPOCH 5 done: loss 0.1792 - lr: 0.000028
2023-10-18 21:47:32,431 DEV : loss 0.15715673565864563 - f1-score (micro avg) 0.5065
2023-10-18 21:47:32,449 saving best model
2023-10-18 21:47:32,487 ----------------------------------------------------------------------------------------------------
2023-10-18 21:47:35,526 epoch 6 - iter 198/1984 - loss 0.18455211 - time (sec): 3.04 - samples/sec: 5515.02 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:47:38,795 epoch 6 - iter 396/1984 - loss 0.18436525 - time (sec): 6.31 - samples/sec: 5325.15 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:47:41,821 epoch 6 - iter 594/1984 - loss 0.17975805 - time (sec): 9.33 - samples/sec: 5309.96 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:47:44,832 epoch 6 - iter 792/1984 - loss
0.17157674 - time (sec): 12.34 - samples/sec: 5300.16 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:47:47,848 epoch 6 - iter 990/1984 - loss 0.17078466 - time (sec): 15.36 - samples/sec: 5315.70 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:47:50,932 epoch 6 - iter 1188/1984 - loss 0.16688998 - time (sec): 18.44 - samples/sec: 5337.14 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:47:54,002 epoch 6 - iter 1386/1984 - loss 0.16587726 - time (sec): 21.51 - samples/sec: 5396.04 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:47:57,025 epoch 6 - iter 1584/1984 - loss 0.16677262 - time (sec): 24.54 - samples/sec: 5343.91 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:48:00,020 epoch 6 - iter 1782/1984 - loss 0.16683507 - time (sec): 27.53 - samples/sec: 5344.66 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:48:03,006 epoch 6 - iter 1980/1984 - loss 0.16690010 - time (sec): 30.52 - samples/sec: 5359.47 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:48:03,067 ----------------------------------------------------------------------------------------------------
2023-10-18 21:48:03,067 EPOCH 6 done: loss 0.1674 - lr: 0.000022
2023-10-18 21:48:04,904 DEV : loss 0.15758880972862244 - f1-score (micro avg) 0.5325
2023-10-18 21:48:04,922 saving best model
2023-10-18 21:48:04,961 ----------------------------------------------------------------------------------------------------
2023-10-18 21:48:07,977 epoch 7 - iter 198/1984 - loss 0.15459941 - time (sec): 3.02 - samples/sec: 5183.56 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:48:10,991 epoch 7 - iter 396/1984 - loss 0.16073533 - time (sec): 6.03 - samples/sec: 5469.06 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:48:14,051 epoch 7 - iter 594/1984 - loss 0.15792323 - time (sec): 9.09 - samples/sec: 5452.10 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:48:17,116 epoch 7 - iter 792/1984 - loss 0.15303565 - time (sec): 12.15 - samples/sec: 5391.67 - lr: 0.000020 - momentum: 0.000000
2023-10-18
21:48:20,201 epoch 7 - iter 990/1984 - loss 0.15501183 - time (sec): 15.24 - samples/sec: 5350.01 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:48:23,341 epoch 7 - iter 1188/1984 - loss 0.15432518 - time (sec): 18.38 - samples/sec: 5327.01 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:48:26,378 epoch 7 - iter 1386/1984 - loss 0.15354368 - time (sec): 21.42 - samples/sec: 5334.08 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:48:29,540 epoch 7 - iter 1584/1984 - loss 0.15398265 - time (sec): 24.58 - samples/sec: 5384.87 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:48:32,576 epoch 7 - iter 1782/1984 - loss 0.15459399 - time (sec): 27.61 - samples/sec: 5374.90 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:48:35,610 epoch 7 - iter 1980/1984 - loss 0.15754883 - time (sec): 30.65 - samples/sec: 5337.49 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:48:35,671 ----------------------------------------------------------------------------------------------------
2023-10-18 21:48:35,671 EPOCH 7 done: loss 0.1573 - lr: 0.000017
2023-10-18 21:48:37,508 DEV : loss 0.15788982808589935 - f1-score (micro avg) 0.5569
2023-10-18 21:48:37,527 saving best model
2023-10-18 21:48:37,561 ----------------------------------------------------------------------------------------------------
2023-10-18 21:48:40,596 epoch 8 - iter 198/1984 - loss 0.14924294 - time (sec): 3.03 - samples/sec: 5847.53 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:48:43,640 epoch 8 - iter 396/1984 - loss 0.14667895 - time (sec): 6.08 - samples/sec: 5664.45 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:48:46,622 epoch 8 - iter 594/1984 - loss 0.14204086 - time (sec): 9.06 - samples/sec: 5694.55 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:48:49,614 epoch 8 - iter 792/1984 - loss 0.14280278 - time (sec): 12.05 - samples/sec: 5715.33 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:48:52,626 epoch 8 - iter 990/1984 - loss 0.14084806 - time (sec): 15.06 - samples/sec: 5572.09 - lr:
0.000014 - momentum: 0.000000
2023-10-18 21:48:55,678 epoch 8 - iter 1188/1984 - loss 0.14241472 - time (sec): 18.12 - samples/sec: 5546.88 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:48:58,761 epoch 8 - iter 1386/1984 - loss 0.14358925 - time (sec): 21.20 - samples/sec: 5494.28 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:49:01,761 epoch 8 - iter 1584/1984 - loss 0.14471371 - time (sec): 24.20 - samples/sec: 5472.04 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:49:04,782 epoch 8 - iter 1782/1984 - loss 0.14729854 - time (sec): 27.22 - samples/sec: 5422.32 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:49:07,819 epoch 8 - iter 1980/1984 - loss 0.14902796 - time (sec): 30.26 - samples/sec: 5409.51 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:49:07,876 ----------------------------------------------------------------------------------------------------
2023-10-18 21:49:07,876 EPOCH 8 done: loss 0.1490 - lr: 0.000011
2023-10-18 21:49:10,078 DEV : loss 0.15831266343593597 - f1-score (micro avg) 0.5582
2023-10-18 21:49:10,097 saving best model
2023-10-18 21:49:10,130 ----------------------------------------------------------------------------------------------------
2023-10-18 21:49:13,217 epoch 9 - iter 198/1984 - loss 0.13787577 - time (sec): 3.09 - samples/sec: 5374.53 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:49:16,276 epoch 9 - iter 396/1984 - loss 0.14301758 - time (sec): 6.15 - samples/sec: 5466.73 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:49:19,440 epoch 9 - iter 594/1984 - loss 0.14924873 - time (sec): 9.31 - samples/sec: 5493.05 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:49:22,490 epoch 9 - iter 792/1984 - loss 0.15082222 - time (sec): 12.36 - samples/sec: 5442.31 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:49:25,512 epoch 9 - iter 990/1984 - loss 0.14676994 - time (sec): 15.38 - samples/sec: 5422.54 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:49:28,365 epoch 9 - iter 1188/1984 - loss 0.14564122 - time
(sec): 18.23 - samples/sec: 5438.66 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:49:31,146 epoch 9 - iter 1386/1984 - loss 0.14469641 - time (sec): 21.02 - samples/sec: 5512.99 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:49:34,194 epoch 9 - iter 1584/1984 - loss 0.14740619 - time (sec): 24.06 - samples/sec: 5467.90 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:49:37,127 epoch 9 - iter 1782/1984 - loss 0.14524676 - time (sec): 27.00 - samples/sec: 5467.48 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:49:40,115 epoch 9 - iter 1980/1984 - loss 0.14330917 - time (sec): 29.98 - samples/sec: 5462.56 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:49:40,171 ----------------------------------------------------------------------------------------------------
2023-10-18 21:49:40,171 EPOCH 9 done: loss 0.1432 - lr: 0.000006
2023-10-18 21:49:41,979 DEV : loss 0.16277986764907837 - f1-score (micro avg) 0.5731
2023-10-18 21:49:41,998 saving best model
2023-10-18 21:49:42,031 ----------------------------------------------------------------------------------------------------
2023-10-18 21:49:45,068 epoch 10 - iter 198/1984 - loss 0.14623963 - time (sec): 3.04 - samples/sec: 5299.69 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:49:48,152 epoch 10 - iter 396/1984 - loss 0.13592264 - time (sec): 6.12 - samples/sec: 5369.93 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:49:51,235 epoch 10 - iter 594/1984 - loss 0.13273236 - time (sec): 9.20 - samples/sec: 5449.73 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:49:54,260 epoch 10 - iter 792/1984 - loss 0.13486998 - time (sec): 12.23 - samples/sec: 5375.78 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:49:57,283 epoch 10 - iter 990/1984 - loss 0.13630045 - time (sec): 15.25 - samples/sec: 5335.35 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:50:00,321 epoch 10 - iter 1188/1984 - loss 0.13938042 - time (sec): 18.29 - samples/sec: 5355.77 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:50:03,337
epoch 10 - iter 1386/1984 - loss 0.14082238 - time (sec): 21.30 - samples/sec: 5372.41 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:50:06,512 epoch 10 - iter 1584/1984 - loss 0.14201225 - time (sec): 24.48 - samples/sec: 5376.70 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:50:09,464 epoch 10 - iter 1782/1984 - loss 0.14275124 - time (sec): 27.43 - samples/sec: 5369.93 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:50:12,545 epoch 10 - iter 1980/1984 - loss 0.14099621 - time (sec): 30.51 - samples/sec: 5365.36 - lr: 0.000000 - momentum: 0.000000
2023-10-18 21:50:12,606 ----------------------------------------------------------------------------------------------------
2023-10-18 21:50:12,606 EPOCH 10 done: loss 0.1412 - lr: 0.000000
2023-10-18 21:50:14,448 DEV : loss 0.1613704115152359 - f1-score (micro avg) 0.5714
2023-10-18 21:50:14,498 ----------------------------------------------------------------------------------------------------
2023-10-18 21:50:14,498 Loading model from best epoch ...
2023-10-18 21:50:14,584 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 21:50:16,122 Results:
- F-score (micro) 0.579
- F-score (macro) 0.4219
- Accuracy 0.4471

By class:
              precision    recall  f1-score   support

         LOC     0.7049    0.7038    0.7044       655
         PER     0.3574    0.5336    0.4281       223
         ORG     0.2264    0.0945    0.1333       127

   micro avg     0.5692    0.5891    0.5790      1005
   macro avg     0.4296    0.4440    0.4219      1005
weighted avg     0.5673    0.5891    0.5709      1005

2023-10-18 21:50:16,122 ----------------------------------------------------------------------------------------------------
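The micro and macro averages in the final report follow mechanically from per-class counts: micro averaging pools true positives, predicted spans, and gold spans over all classes before computing precision/recall/F1, while macro averaging takes the unweighted mean of the per-class scores. The sketch below recomputes the table; the integer counts are reconstructed from the reported precision, recall, and support (hypothetical, but consistent with the printed figures):

```python
# (true positives, predicted spans, gold spans) per class -- reconstructed counts
counts = {
    "LOC": (461, 654, 655),
    "PER": (119, 333, 223),
    "ORG": (12, 53, 127),
}

def prf(tp: int, pred: int, gold: int) -> tuple:
    """Precision, recall, F1 from span counts."""
    p, r = tp / pred, tp / gold
    return p, r, 2 * p * r / (p + r)

per_class = {c: prf(*v) for c, v in counts.items()}

# micro: pool the raw counts over classes, then score once
tp, pred, gold = (sum(v[i] for v in counts.values()) for i in range(3))
micro_p, micro_r, micro_f1 = prf(tp, pred, gold)

# macro: unweighted mean of the per-class F1 scores
macro_f1 = sum(f for _, _, f in per_class.values()) / len(per_class)

print(round(micro_f1, 4), round(macro_f1, 4))
```

Rounded to four decimals this reproduces the logged "F-score (micro) 0.579" and "F-score (macro) 0.4219", as well as each per-class row.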
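For a sense of scale, the tagger's parameter count can be tallied from the module summary printed at the top of the log (hidden size 128, 2 encoder layers, intermediate size 512, vocab 32001, 13 output tags). This is an illustrative recomputation from the printed layer shapes, not a figure stated in the log:

```python
H, I, V, P, L, TAGS = 128, 512, 32001, 512, 2, 13   # from the printed BertModel repr

embeddings = V * H + P * H + 2 * H + 2 * H          # word/pos/type embeddings + LayerNorm
per_layer = (
    3 * (H * H + H)         # query/key/value projections
    + (H * H + H) + 2 * H   # attention output dense + LayerNorm
    + (H * I + I)           # intermediate dense
    + (I * H + H) + 2 * H   # output dense + LayerNorm
)
pooler = H * H + H
head = H * TAGS + TAGS                              # final Linear(128 -> 13)

total = embeddings + L * per_layer + pooler + head
print(f"{total:,} parameters")                      # under 5M: a "tiny" BERT
```

Most of the budget sits in the word-embedding matrix (32001 × 128), which is typical for small-hidden-size models.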