2023-10-18 21:54:33,271 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Train: 7936 sentences
2023-10-18 21:54:33,272 (train_with_dev=False, train_with_test=False)
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Training Params:
2023-10-18 21:54:33,272 - learning_rate: "5e-05"
2023-10-18 21:54:33,272 - mini_batch_size: "8"
2023-10-18 21:54:33,272 - max_epochs: "10"
2023-10-18 21:54:33,272 - shuffle: "True"
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Plugins:
2023-10-18 21:54:33,272 - TensorboardLogger
2023-10-18 21:54:33,272 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 21:54:33,272 - metric: "('micro avg', 'f1-score')"
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Computation:
2023-10-18 21:54:33,272 - compute on device: cuda:0
2023-10-18 21:54:33,272 - embedding storage: none
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 21:54:33,273 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,273 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,273 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 21:54:35,500 epoch 1 - iter 99/992 - loss 3.11355291 - time (sec): 2.23 - samples/sec: 7666.30 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:54:37,678 epoch 1 - iter 198/992 - loss 2.79550926 - time (sec): 4.40 - samples/sec: 7440.64 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:54:39,905 epoch 1 - iter 297/992 - loss 2.26929089 - time (sec): 6.63 - samples/sec: 7505.99 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:54:42,204 epoch 1 - iter 396/992 - loss 1.82821963 - time (sec): 8.93 - samples/sec: 7490.72 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:54:44,416 epoch 1 - iter 495/992 - loss 1.57018564 - time (sec): 11.14 - samples/sec: 7467.81 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:54:46,645 epoch 1 - iter 594/992 - loss 1.38359360 - time (sec): 13.37 - samples/sec: 7441.53 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:54:48,875 epoch 1 - iter 693/992 - loss 1.23623179 - time (sec): 15.60 - samples/sec: 7446.06 - lr: 0.000035 - momentum: 0.000000
2023-10-18 21:54:51,050 epoch 1 - iter 792/992 - loss 1.12943382 - time (sec): 17.78 - samples/sec: 7407.13 - lr: 0.000040 - momentum: 0.000000
2023-10-18 21:54:53,334 epoch 1 - iter 891/992 - loss 1.04184352 - time (sec): 20.06 - samples/sec: 7358.15 - lr: 0.000045 - momentum: 0.000000
2023-10-18 21:54:55,537 epoch 1 - iter 990/992 - loss 0.97151767 - time (sec): 22.26 - samples/sec: 7352.94 - lr: 0.000050 - momentum: 0.000000
2023-10-18 21:54:55,582 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:55,582 EPOCH 1 done: loss 0.9704 - lr: 0.000050
2023-10-18 21:54:57,145 DEV : loss 0.21833859384059906 - f1-score (micro avg) 0.3255
2023-10-18 21:54:57,164 saving best model
2023-10-18 21:54:57,197 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:59,459 epoch 2 - iter 99/992 - loss 0.32706359 - time (sec): 2.26 - samples/sec: 7173.70 - lr: 0.000049 - momentum: 0.000000
2023-10-18 21:55:01,708 epoch 2 - iter 198/992 - loss 0.30563463 - time (sec): 4.51 - samples/sec: 7302.81 - lr: 0.000049 - momentum: 0.000000
2023-10-18 21:55:03,969 epoch 2 - iter 297/992 - loss 0.29429747 - time (sec): 6.77 - samples/sec: 7288.04 - lr: 0.000048 - momentum: 0.000000
2023-10-18 21:55:06,244 epoch 2 - iter 396/992 - loss 0.29148563 - time (sec): 9.05 - samples/sec: 7349.37 - lr: 0.000048 - momentum: 0.000000
2023-10-18 21:55:08,509 epoch 2 - iter 495/992 - loss 0.28801704 - time (sec): 11.31 - samples/sec: 7327.17 - lr: 0.000047 - momentum: 0.000000
2023-10-18 21:55:10,695 epoch 2 - iter 594/992 - loss 0.28735579 - time (sec): 13.50 - samples/sec: 7332.54 - lr: 0.000047 - momentum: 0.000000
2023-10-18 21:55:12,926 epoch 2 - iter 693/992 - loss 0.28460464 - time (sec): 15.73 - samples/sec: 7250.53 - lr: 0.000046 - momentum: 0.000000
2023-10-18 21:55:15,130 epoch 2 - iter 792/992 - loss 0.28520998 - time (sec): 17.93 - samples/sec: 7217.01 - lr: 0.000046 - momentum: 0.000000
2023-10-18 21:55:17,353 epoch 2 - iter 891/992 - loss 0.28026538 - time (sec): 20.16 - samples/sec: 7269.54 - lr: 0.000045 - momentum: 0.000000
2023-10-18 21:55:19,536 epoch 2 - iter 990/992 - loss 0.27503839 - time (sec): 22.34 - samples/sec: 7321.67 - lr: 0.000044 - momentum: 0.000000
2023-10-18 21:55:19,584 ----------------------------------------------------------------------------------------------------
2023-10-18 21:55:19,584 EPOCH 2 done: loss 0.2748 - lr: 0.000044
2023-10-18 21:55:21,774 DEV : loss 0.19050592184066772 - f1-score (micro avg) 0.3642
2023-10-18 21:55:21,794 saving best model
2023-10-18 21:55:21,829 ----------------------------------------------------------------------------------------------------
2023-10-18 21:55:24,022 epoch 3 - iter 99/992 - loss 0.24311763 - time (sec): 2.19 - samples/sec: 7302.08 - lr: 0.000044 - momentum: 0.000000
2023-10-18 21:55:26,230 epoch 3 - iter 198/992 - loss 0.23452324 - time (sec): 4.40 - samples/sec: 7287.89 - lr: 0.000043 - momentum: 0.000000
2023-10-18 21:55:28,453 epoch 3 - iter 297/992 - loss 0.22907305 - time (sec): 6.62 - samples/sec: 7223.86 - lr: 0.000043 - momentum: 0.000000
2023-10-18 21:55:30,594 epoch 3 - iter 396/992 - loss 0.23622014 - time (sec): 8.76 - samples/sec: 7324.50 - lr: 0.000042 - momentum: 0.000000
2023-10-18 21:55:32,575 epoch 3 - iter 495/992 - loss 0.23845281 - time (sec): 10.75 - samples/sec: 7506.47 - lr: 0.000042 - momentum: 0.000000
2023-10-18 21:55:34,775 epoch 3 - iter 594/992 - loss 0.23683125 - time (sec): 12.95 - samples/sec: 7508.86 - lr: 0.000041 - momentum: 0.000000
2023-10-18 21:55:36,956 epoch 3 - iter 693/992 - loss 0.23617397 - time (sec): 15.13 - samples/sec: 7515.06 - lr: 0.000041 - momentum: 0.000000
2023-10-18 21:55:39,166 epoch 3 - iter 792/992 - loss 0.23480189 - time (sec): 17.34 - samples/sec: 7516.39 - lr: 0.000040 - momentum: 0.000000
2023-10-18 21:55:41,385 epoch 3 - iter 891/992 - loss 0.23528113 - time (sec): 19.56 - samples/sec: 7487.53 - lr: 0.000039 - momentum: 0.000000
2023-10-18 21:55:43,685 epoch 3 - iter 990/992 - loss 0.23330811 - time (sec): 21.86 - samples/sec: 7491.34 - lr: 0.000039 - momentum: 0.000000
2023-10-18 21:55:43,736 ----------------------------------------------------------------------------------------------------
2023-10-18 21:55:43,736 EPOCH 3 done: loss 0.2337 - lr: 0.000039
2023-10-18 21:55:45,582 DEV : loss 0.1749068647623062 - f1-score (micro avg) 0.4182
2023-10-18 21:55:45,601 saving best model
2023-10-18 21:55:45,640 ----------------------------------------------------------------------------------------------------
2023-10-18 21:55:47,867 epoch 4 - iter 99/992 - loss 0.23209216 - time (sec): 2.23 - samples/sec: 7316.80 - lr: 0.000038 - momentum: 0.000000
2023-10-18 21:55:50,006 epoch 4 - iter 198/992 - loss 0.21943860 - time (sec): 4.37 - samples/sec: 7335.23 - lr: 0.000038 - momentum: 0.000000
2023-10-18 21:55:52,141 epoch 4 - iter 297/992 - loss 0.21580522 - time (sec): 6.50 - samples/sec: 7473.64 - lr: 0.000037 - momentum: 0.000000
2023-10-18 21:55:54,086 epoch 4 - iter 396/992 - loss 0.21521306 - time (sec): 8.44 - samples/sec: 7706.14 - lr: 0.000037 - momentum: 0.000000
2023-10-18 21:55:56,286 epoch 4 - iter 495/992 - loss 0.21303400 - time (sec): 10.65 - samples/sec: 7697.26 - lr: 0.000036 - momentum: 0.000000
2023-10-18 21:55:58,585 epoch 4 - iter 594/992 - loss 0.21386872 - time (sec): 12.94 - samples/sec: 7620.66 - lr: 0.000036 - momentum: 0.000000
2023-10-18 21:56:00,843 epoch 4 - iter 693/992 - loss 0.21292105 - time (sec): 15.20 - samples/sec: 7609.58 - lr: 0.000035 - momentum: 0.000000
2023-10-18 21:56:03,054 epoch 4 - iter 792/992 - loss 0.21167609 - time (sec): 17.41 - samples/sec: 7582.95 - lr: 0.000034 - momentum: 0.000000
2023-10-18 21:56:05,288 epoch 4 - iter 891/992 - loss 0.21036360 - time (sec): 19.65 - samples/sec: 7539.23 - lr: 0.000034 - momentum: 0.000000
2023-10-18 21:56:07,573 epoch 4 - iter 990/992 - loss 0.21009866 - time (sec): 21.93 - samples/sec: 7461.11 - lr: 0.000033 - momentum: 0.000000
2023-10-18 21:56:07,624 ----------------------------------------------------------------------------------------------------
2023-10-18 21:56:07,625 EPOCH 4 done: loss 0.2099 - lr: 0.000033
2023-10-18 21:56:09,454 DEV : loss 0.1589556187391281 - f1-score (micro avg) 0.4302
2023-10-18 21:56:09,473 saving best model
2023-10-18 21:56:09,507 ----------------------------------------------------------------------------------------------------
2023-10-18 21:56:11,681 epoch 5 - iter 99/992 - loss 0.21244283 - time (sec): 2.17 - samples/sec: 7120.21 - lr: 0.000033 - momentum: 0.000000
2023-10-18 21:56:13,909 epoch 5 - iter 198/992 - loss 0.20346866 - time (sec): 4.40 - samples/sec: 7143.14 - lr: 0.000032 - momentum: 0.000000
2023-10-18 21:56:16,179 epoch 5 - iter 297/992 - loss 0.19250101 - time (sec): 6.67 - samples/sec: 7157.94 - lr: 0.000032 - momentum: 0.000000
2023-10-18 21:56:18,394 epoch 5 - iter 396/992 - loss 0.19416248 - time (sec): 8.89 - samples/sec: 7174.94 - lr: 0.000031 - momentum: 0.000000
2023-10-18 21:56:20,709 epoch 5 - iter 495/992 - loss 0.19298598 - time (sec): 11.20 - samples/sec: 7166.12 - lr: 0.000031 - momentum: 0.000000
2023-10-18 21:56:22,990 epoch 5 - iter 594/992 - loss 0.19140411 - time (sec): 13.48 - samples/sec: 7227.29 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:56:25,289 epoch 5 - iter 693/992 - loss 0.19104653 - time (sec): 15.78 - samples/sec: 7234.47 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:56:27,581 epoch 5 - iter 792/992 - loss 0.19022986 - time (sec): 18.07 - samples/sec: 7240.62 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:56:29,785 epoch 5 - iter 891/992 - loss 0.18871481 - time (sec): 20.28 - samples/sec: 7294.24 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:56:31,990 epoch 5 - iter 990/992 - loss 0.19191013 - time (sec): 22.48 - samples/sec: 7277.88 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:56:32,037 ----------------------------------------------------------------------------------------------------
2023-10-18 21:56:32,037 EPOCH 5 done: loss 0.1918 - lr: 0.000028
2023-10-18 21:56:33,887 DEV : loss 0.15180285274982452 - f1-score (micro avg) 0.4646
2023-10-18 21:56:33,907 saving best model
2023-10-18 21:56:33,942 ----------------------------------------------------------------------------------------------------
2023-10-18 21:56:36,213 epoch 6 - iter 99/992 - loss 0.19371930 - time (sec): 2.27 - samples/sec: 7383.02 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:56:38,443 epoch 6 - iter 198/992 - loss 0.19060230 - time (sec): 4.50 - samples/sec: 7462.91 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:56:40,617 epoch 6 - iter 297/992 - loss 0.18894797 - time (sec): 6.67 - samples/sec: 7425.76 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:56:42,855 epoch 6 - iter 396/992 - loss 0.18244944 - time (sec): 8.91 - samples/sec: 7341.36 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:56:45,105 epoch 6 - iter 495/992 - loss 0.18110119 - time (sec): 11.16 - samples/sec: 7315.36 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:56:47,397 epoch 6 - iter 594/992 - loss 0.17861193 - time (sec): 13.45 - samples/sec: 7316.73 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:56:49,686 epoch 6 - iter 693/992 - loss 0.17844836 - time (sec): 15.74 - samples/sec: 7374.44 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:56:51,924 epoch 6 - iter 792/992 - loss 0.17957471 - time (sec): 17.98 - samples/sec: 7292.61 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:56:54,153 epoch 6 - iter 891/992 - loss 0.17926821 - time (sec): 20.21 - samples/sec: 7280.88 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:56:56,391 epoch 6 - iter 990/992 - loss 0.17924101 - time (sec): 22.45 - samples/sec: 7286.45 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:56:56,441 ----------------------------------------------------------------------------------------------------
2023-10-18 21:56:56,441 EPOCH 6 done: loss 0.1799 - lr: 0.000022
2023-10-18 21:56:58,301 DEV : loss 0.15175634622573853 - f1-score (micro avg) 0.4636
2023-10-18 21:56:58,320 ----------------------------------------------------------------------------------------------------
2023-10-18 21:57:00,560 epoch 7 - iter 99/992 - loss 0.17331896 - time (sec): 2.24 - samples/sec: 6979.09 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:57:02,905 epoch 7 - iter 198/992 - loss 0.17621820 - time (sec): 4.58 - samples/sec: 7191.86 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:57:05,242 epoch 7 - iter 297/992 - loss 0.17242925 - time (sec): 6.92 - samples/sec: 7159.28 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:57:07,579 epoch 7 - iter 396/992 - loss 0.16680316 - time (sec): 9.26 - samples/sec: 7077.62 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:57:09,886 epoch 7 - iter 495/992 - loss 0.16917314 - time (sec): 11.57 - samples/sec: 7049.19 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:57:12,114 epoch 7 - iter 594/992 - loss 0.16845160 - time (sec): 13.79 - samples/sec: 7097.60 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:57:14,364 epoch 7 - iter 693/992 - loss 0.16833284 - time (sec): 16.04 - samples/sec: 7120.30 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:57:16,729 epoch 7 - iter 792/992 - loss 0.16816329 - time (sec): 18.41 - samples/sec: 7189.87 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:57:18,914 epoch 7 - iter 891/992 - loss 0.16870231 - time (sec): 20.59 - samples/sec: 7207.37 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:57:21,088 epoch 7 - iter 990/992 - loss 0.17120803 - time (sec): 22.77 - samples/sec: 7184.77 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:57:21,133 ----------------------------------------------------------------------------------------------------
2023-10-18 21:57:21,133 EPOCH 7 done: loss 0.1710 - lr: 0.000017
2023-10-18 21:57:23,365 DEV : loss 0.14771050214767456 - f1-score (micro avg) 0.4614
2023-10-18 21:57:23,384 ----------------------------------------------------------------------------------------------------
2023-10-18 21:57:25,576 epoch 8 - iter 99/992 - loss 0.16145510 - time (sec): 2.19 - samples/sec: 8094.49 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:57:27,798 epoch 8 - iter 198/992 - loss 0.16238181 - time (sec): 4.41 - samples/sec: 7801.61 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:57:30,048 epoch 8 - iter 297/992 - loss 0.16042643 - time (sec): 6.66 - samples/sec: 7743.08 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:57:32,312 epoch 8 - iter 396/992 - loss 0.16123392 - time (sec): 8.93 - samples/sec: 7716.08 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:57:34,474 epoch 8 - iter 495/992 - loss 0.15882526 - time (sec): 11.09 - samples/sec: 7568.87 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:57:36,702 epoch 8 - iter 594/992 - loss 0.15852308 - time (sec): 13.32 - samples/sec: 7545.54 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:57:38,933 epoch 8 - iter 693/992 - loss 0.16090394 - time (sec): 15.55 - samples/sec: 7490.79 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:57:41,122 epoch 8 - iter 792/992 - loss 0.16162584 - time (sec): 17.74 - samples/sec: 7465.42 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:57:43,302 epoch 8 - iter 891/992 - loss 0.16350113 - time (sec): 19.92 - samples/sec: 7410.15 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:57:45,570 epoch 8 - iter 990/992 - loss 0.16534095 - time (sec): 22.19 - samples/sec: 7377.71 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:57:45,617 ----------------------------------------------------------------------------------------------------
2023-10-18 21:57:45,617 EPOCH 8 done: loss 0.1653 - lr: 0.000011
2023-10-18 21:57:47,461 DEV : loss 0.15022730827331543 - f1-score (micro avg) 0.4734
2023-10-18 21:57:47,483 saving best model
2023-10-18 21:57:47,522 ----------------------------------------------------------------------------------------------------
2023-10-18 21:57:49,845 epoch 9 - iter 99/992 - loss 0.15429276 - time (sec): 2.32 - samples/sec: 7144.31 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:57:52,069 epoch 9 - iter 198/992 - loss 0.15976108 - time (sec): 4.55 - samples/sec: 7389.48 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:57:54,419 epoch 9 - iter 297/992 - loss 0.16814053 - time (sec): 6.90 - samples/sec: 7414.35 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:57:56,610 epoch 9 - iter 396/992 - loss 0.16893864 - time (sec): 9.09 - samples/sec: 7402.74 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:57:58,842 epoch 9 - iter 495/992 - loss 0.16480279 - time (sec): 11.32 - samples/sec: 7368.59 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:58:01,049 epoch 9 - iter 594/992 - loss 0.16380758 - time (sec): 13.53 - samples/sec: 7331.97 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:58:03,242 epoch 9 - iter 693/992 - loss 0.16219906 - time (sec): 15.72 - samples/sec: 7370.67 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:58:05,443 epoch 9 - iter 792/992 - loss 0.16449280 - time (sec): 17.92 - samples/sec: 7342.15 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:58:07,702 epoch 9 - iter 891/992 - loss 0.16241771 - time (sec): 20.18 - samples/sec: 7314.62 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:58:09,923 epoch 9 - iter 990/992 - loss 0.15995791 - time (sec): 22.40 - samples/sec: 7312.03 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:58:09,963 ----------------------------------------------------------------------------------------------------
2023-10-18 21:58:09,964 EPOCH 9 done: loss 0.1599 - lr: 0.000006
2023-10-18 21:58:11,791 DEV : loss 0.15096156299114227 - f1-score (micro avg) 0.4811
2023-10-18 21:58:11,811 saving best model
2023-10-18 21:58:11,847 ----------------------------------------------------------------------------------------------------
2023-10-18 21:58:14,177 epoch 10 - iter 99/992 - loss 0.16010106 - time (sec): 2.33 - samples/sec: 6908.82 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:58:16,389 epoch 10 - iter 198/992 - loss 0.14789645 - time (sec): 4.54 - samples/sec: 7237.38 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:58:18,701 epoch 10 - iter 297/992 - loss 0.14655674 - time (sec): 6.85 - samples/sec: 7318.07 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:58:20,992 epoch 10 - iter 396/992 - loss 0.15028578 - time (sec): 9.14 - samples/sec: 7188.41 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:58:23,161 epoch 10 - iter 495/992 - loss 0.15310770 - time (sec): 11.31 - samples/sec: 7192.41 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:58:25,426 epoch 10 - iter 594/992 - loss 0.15629795 - time (sec): 13.58 - samples/sec: 7213.73 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:58:27,654 epoch 10 - iter 693/992 - loss 0.15763389 - time (sec): 15.81 - samples/sec: 7241.31 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:58:29,889 epoch 10 - iter 792/992 - loss 0.15858926 - time (sec): 18.04 - samples/sec: 7295.51 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:58:32,062 epoch 10 - iter 891/992 - loss 0.15834786 - time (sec): 20.21 - samples/sec: 7287.22 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:58:34,265 epoch 10 - iter 990/992 - loss 0.15652090 - time (sec): 22.42 - samples/sec: 7302.97 - lr: 0.000000 - momentum: 0.000000
2023-10-18 21:58:34,312 ----------------------------------------------------------------------------------------------------
2023-10-18 21:58:34,312 EPOCH 10 done: loss 0.1566 - lr: 0.000000
2023-10-18 21:58:36,128 DEV : loss 0.1495286077260971 - f1-score (micro avg) 0.4881
2023-10-18 21:58:36,146 saving best model
2023-10-18 21:58:36,204 ----------------------------------------------------------------------------------------------------
2023-10-18 21:58:36,204 Loading model from best epoch ...
2023-10-18 21:58:36,283 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 21:58:37,756 
Results:
- F-score (micro) 0.542
- F-score (macro) 0.3671
- Accuracy 0.4099

By class:
              precision    recall  f1-score   support

         LOC     0.7039    0.6824    0.6930       655
         PER     0.2770    0.5067    0.3582       223
         ORG     0.1212    0.0315    0.0500       127

   micro avg     0.5242    0.5612    0.5420      1005
   macro avg     0.3674    0.4069    0.3671      1005
weighted avg     0.5356    0.5612    0.5375      1005

2023-10-18 21:58:37,756 ----------------------------------------------------------------------------------------------------