2023-10-18 21:54:33,271 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Train: 7936 sentences
2023-10-18 21:54:33,272 (train_with_dev=False, train_with_test=False)
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Training Params:
2023-10-18 21:54:33,272 - learning_rate: "5e-05"
2023-10-18 21:54:33,272 - mini_batch_size: "8"
2023-10-18 21:54:33,272 - max_epochs: "10"
2023-10-18 21:54:33,272 - shuffle: "True"
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Plugins:
2023-10-18 21:54:33,272 - TensorboardLogger
2023-10-18 21:54:33,272 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 21:54:33,272 - metric: "('micro avg', 'f1-score')"
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Computation:
2023-10-18 21:54:33,272 - compute on device: cuda:0
2023-10-18 21:54:33,272 - embedding storage: none
2023-10-18 21:54:33,272 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,272 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 21:54:33,273 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,273 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:33,273 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 21:54:35,500 epoch 1 - iter 99/992 - loss 3.11355291 - time (sec): 2.23 - samples/sec: 7666.30 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:54:37,678 epoch 1 - iter 198/992 - loss 2.79550926 - time (sec): 4.40 - samples/sec: 7440.64 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:54:39,905 epoch 1 - iter 297/992 - loss 2.26929089 - time (sec): 6.63 - samples/sec: 7505.99 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:54:42,204 epoch 1 - iter 396/992 - loss 1.82821963 - time (sec): 8.93 - samples/sec: 7490.72 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:54:44,416 epoch 1 - iter 495/992 - loss 1.57018564 - time (sec): 11.14 - samples/sec: 7467.81 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:54:46,645 epoch 1 - iter 594/992 - loss 1.38359360 - time (sec): 13.37 - samples/sec: 7441.53 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:54:48,875 epoch 1 - iter 693/992 - loss 1.23623179 - time (sec): 15.60 - samples/sec: 7446.06 - lr: 0.000035 - momentum: 0.000000
2023-10-18 21:54:51,050 epoch 1 - iter 792/992 - loss 1.12943382 - time (sec): 17.78 - samples/sec: 7407.13 - lr: 0.000040 - momentum: 0.000000
2023-10-18 21:54:53,334 epoch 1 - iter 891/992 - loss 1.04184352 - time (sec): 20.06 - samples/sec: 7358.15 - lr: 0.000045 - momentum: 0.000000
2023-10-18 21:54:55,537 epoch 1 - iter 990/992 - loss 0.97151767 - time (sec): 22.26 - samples/sec: 7352.94 - lr: 0.000050 - momentum: 0.000000
2023-10-18 21:54:55,582 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:55,582 EPOCH 1 done: loss 0.9704 - lr: 0.000050
2023-10-18 21:54:57,145 DEV : loss 0.21833859384059906 - f1-score (micro avg) 0.3255
2023-10-18 21:54:57,164 saving best model
2023-10-18 21:54:57,197 ----------------------------------------------------------------------------------------------------
2023-10-18 21:54:59,459 epoch 2 - iter 99/992 - loss 0.32706359 - time (sec): 2.26 - samples/sec: 7173.70 - lr: 0.000049 - momentum: 0.000000
2023-10-18 21:55:01,708 epoch 2 - iter 198/992 - loss 0.30563463 - time (sec): 4.51 - samples/sec: 7302.81 - lr: 0.000049 - momentum: 0.000000
2023-10-18 21:55:03,969 epoch 2 - iter 297/992 - loss 0.29429747 - time (sec): 6.77 - samples/sec: 7288.04 - lr: 0.000048 - momentum: 0.000000
2023-10-18 21:55:06,244 epoch 2 - iter 396/992 - loss 0.29148563 - time (sec): 9.05 - samples/sec: 7349.37 - lr: 0.000048 - momentum: 0.000000
2023-10-18 21:55:08,509 epoch 2 - iter 495/992 - loss 0.28801704 - time (sec): 11.31 - samples/sec: 7327.17 - lr: 0.000047 - momentum: 0.000000
2023-10-18 21:55:10,695 epoch 2 - iter 594/992 - loss 0.28735579 - time (sec): 13.50 - samples/sec: 7332.54 - lr: 0.000047 - momentum: 0.000000
2023-10-18 21:55:12,926 epoch 2 - iter 693/992 - loss 0.28460464 - time (sec): 15.73 - samples/sec: 7250.53 - lr: 0.000046 - momentum: 0.000000
2023-10-18 21:55:15,130 epoch 2 - iter 792/992 - loss 0.28520998 - time (sec): 17.93 - samples/sec: 7217.01 - lr: 0.000046 - momentum: 0.000000
2023-10-18 21:55:17,353 epoch 2 - iter 891/992 - loss 0.28026538 - time (sec): 20.16 - samples/sec: 7269.54 - lr: 0.000045 - momentum: 0.000000
2023-10-18 21:55:19,536 epoch 2 - iter 990/992 - loss 0.27503839 - time (sec): 22.34 - samples/sec: 7321.67 - lr: 0.000044 - momentum: 0.000000
2023-10-18 21:55:19,584 ----------------------------------------------------------------------------------------------------
2023-10-18 21:55:19,584 EPOCH 2 done: loss 0.2748 - lr: 0.000044
2023-10-18 21:55:21,774 DEV : loss 0.19050592184066772 - f1-score (micro avg) 0.3642
2023-10-18 21:55:21,794 saving best model
2023-10-18 21:55:21,829 ----------------------------------------------------------------------------------------------------
2023-10-18 21:55:24,022 epoch 3 - iter 99/992 - loss 0.24311763 - time (sec): 2.19 - samples/sec: 7302.08 - lr: 0.000044 - momentum: 0.000000
2023-10-18 21:55:26,230 epoch 3 - iter 198/992 - loss 0.23452324 - time (sec): 4.40 - samples/sec: 7287.89 - lr: 0.000043 - momentum: 0.000000
2023-10-18 21:55:28,453 epoch 3 - iter 297/992 - loss 0.22907305 - time (sec): 6.62 - samples/sec: 7223.86 - lr: 0.000043 - momentum: 0.000000
2023-10-18 21:55:30,594 epoch 3 - iter 396/992 - loss 0.23622014 - time (sec): 8.76 - samples/sec: 7324.50 - lr: 0.000042 - momentum: 0.000000
2023-10-18 21:55:32,575 epoch 3 - iter 495/992 - loss 0.23845281 - time (sec): 10.75 - samples/sec: 7506.47 - lr: 0.000042 - momentum: 0.000000
2023-10-18 21:55:34,775 epoch 3 - iter 594/992 - loss 0.23683125 - time (sec): 12.95 - samples/sec: 7508.86 - lr: 0.000041 - momentum: 0.000000
2023-10-18 21:55:36,956 epoch 3 - iter 693/992 - loss 0.23617397 - time (sec): 15.13 - samples/sec: 7515.06 - lr: 0.000041 - momentum: 0.000000
2023-10-18 21:55:39,166 epoch 3 - iter 792/992 - loss 0.23480189 - time (sec): 17.34 - samples/sec: 7516.39 - lr: 0.000040 - momentum: 0.000000
2023-10-18 21:55:41,385 epoch 3 - iter 891/992 - loss 0.23528113 - time (sec): 19.56 - samples/sec: 7487.53 - lr: 0.000039 - momentum: 0.000000
2023-10-18 21:55:43,685 epoch 3 - iter 990/992 - loss 0.23330811 - time (sec): 21.86 - samples/sec: 7491.34 - lr: 0.000039 - momentum: 0.000000
2023-10-18 21:55:43,736 ----------------------------------------------------------------------------------------------------
2023-10-18 21:55:43,736 EPOCH 3 done: loss 0.2337 - lr: 0.000039
2023-10-18 21:55:45,582 DEV : loss 0.1749068647623062 - f1-score (micro avg) 0.4182
2023-10-18 21:55:45,601 saving best model
2023-10-18 21:55:45,640 ----------------------------------------------------------------------------------------------------
2023-10-18 21:55:47,867 epoch 4 - iter 99/992 - loss 0.23209216 - time (sec): 2.23 - samples/sec: 7316.80 - lr: 0.000038 - momentum: 0.000000
2023-10-18 21:55:50,006 epoch 4 - iter 198/992 - loss 0.21943860 - time (sec): 4.37 - samples/sec: 7335.23 - lr: 0.000038 - momentum: 0.000000
2023-10-18 21:55:52,141 epoch 4 - iter 297/992 - loss 0.21580522 - time (sec): 6.50 - samples/sec: 7473.64 - lr: 0.000037 - momentum: 0.000000
2023-10-18 21:55:54,086 epoch 4 - iter 396/992 - loss 0.21521306 - time (sec): 8.44 - samples/sec: 7706.14 - lr: 0.000037 - momentum: 0.000000
2023-10-18 21:55:56,286 epoch 4 - iter 495/992 - loss 0.21303400 - time (sec): 10.65 - samples/sec: 7697.26 - lr: 0.000036 - momentum: 0.000000
2023-10-18 21:55:58,585 epoch 4 - iter 594/992 - loss 0.21386872 - time (sec): 12.94 - samples/sec: 7620.66 - lr: 0.000036 - momentum: 0.000000
2023-10-18 21:56:00,843 epoch 4 - iter 693/992 - loss 0.21292105 - time (sec): 15.20 - samples/sec: 7609.58 - lr: 0.000035 - momentum: 0.000000
2023-10-18 21:56:03,054 epoch 4 - iter 792/992 - loss 0.21167609 - time (sec): 17.41 - samples/sec: 7582.95 - lr: 0.000034 - momentum: 0.000000
2023-10-18 21:56:05,288 epoch 4 - iter 891/992 - loss 0.21036360 - time (sec): 19.65 - samples/sec: 7539.23 - lr: 0.000034 - momentum: 0.000000
2023-10-18 21:56:07,573 epoch 4 - iter 990/992 - loss 0.21009866 - time (sec): 21.93 - samples/sec: 7461.11 - lr: 0.000033 - momentum: 0.000000
2023-10-18 21:56:07,624 ----------------------------------------------------------------------------------------------------
2023-10-18 21:56:07,625 EPOCH 4 done: loss 0.2099 - lr: 0.000033
2023-10-18 21:56:09,454 DEV : loss 0.1589556187391281 - f1-score (micro avg) 0.4302
2023-10-18 21:56:09,473 saving best model
2023-10-18 21:56:09,507 ----------------------------------------------------------------------------------------------------
2023-10-18 21:56:11,681 epoch 5 - iter 99/992 - loss 0.21244283 - time (sec): 2.17 - samples/sec: 7120.21 - lr: 0.000033 - momentum: 0.000000
2023-10-18 21:56:13,909 epoch 5 - iter 198/992 - loss 0.20346866 - time (sec): 4.40 - samples/sec: 7143.14 - lr: 0.000032 - momentum: 0.000000
2023-10-18 21:56:16,179 epoch 5 - iter 297/992 - loss 0.19250101 - time (sec): 6.67 - samples/sec: 7157.94 - lr: 0.000032 - momentum: 0.000000
2023-10-18 21:56:18,394 epoch 5 - iter 396/992 - loss 0.19416248 - time (sec): 8.89 - samples/sec: 7174.94 - lr: 0.000031 - momentum: 0.000000
2023-10-18 21:56:20,709 epoch 5 - iter 495/992 - loss 0.19298598 - time (sec): 11.20 - samples/sec: 7166.12 - lr: 0.000031 - momentum: 0.000000
2023-10-18 21:56:22,990 epoch 5 - iter 594/992 - loss 0.19140411 - time (sec): 13.48 - samples/sec: 7227.29 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:56:25,289 epoch 5 - iter 693/992 - loss 0.19104653 - time (sec): 15.78 - samples/sec: 7234.47 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:56:27,581 epoch 5 - iter 792/992 - loss 0.19022986 - time (sec): 18.07 - samples/sec: 7240.62 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:56:29,785 epoch 5 - iter 891/992 - loss 0.18871481 - time (sec): 20.28 - samples/sec: 7294.24 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:56:31,990 epoch 5 - iter 990/992 - loss 0.19191013 - time (sec): 22.48 - samples/sec: 7277.88 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:56:32,037 ----------------------------------------------------------------------------------------------------
2023-10-18 21:56:32,037 EPOCH 5 done: loss 0.1918 - lr: 0.000028
2023-10-18 21:56:33,887 DEV : loss 0.15180285274982452 - f1-score (micro avg) 0.4646
2023-10-18 21:56:33,907 saving best model
2023-10-18 21:56:33,942 ----------------------------------------------------------------------------------------------------
2023-10-18 21:56:36,213 epoch 6 - iter 99/992 - loss 0.19371930 - time (sec): 2.27 - samples/sec: 7383.02 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:56:38,443 epoch 6 - iter 198/992 - loss 0.19060230 - time (sec): 4.50 - samples/sec: 7462.91 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:56:40,617 epoch 6 - iter 297/992 - loss 0.18894797 - time (sec): 6.67 - samples/sec: 7425.76 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:56:42,855 epoch 6 - iter 396/992 - loss 0.18244944 - time (sec): 8.91 - samples/sec: 7341.36 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:56:45,105 epoch 6 - iter 495/992 - loss 0.18110119 - time (sec): 11.16 - samples/sec: 7315.36 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:56:47,397 epoch 6 - iter 594/992 - loss 0.17861193 - time (sec): 13.45 - samples/sec: 7316.73 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:56:49,686 epoch 6 - iter 693/992 - loss 0.17844836 - time (sec): 15.74 - samples/sec: 7374.44 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:56:51,924 epoch 6 - iter 792/992 - loss 0.17957471 - time (sec): 17.98 - samples/sec: 7292.61 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:56:54,153 epoch 6 - iter 891/992 - loss 0.17926821 - time (sec): 20.21 - samples/sec: 7280.88 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:56:56,391 epoch 6 - iter 990/992 - loss 0.17924101 - time (sec): 22.45 - samples/sec: 7286.45 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:56:56,441 ----------------------------------------------------------------------------------------------------
2023-10-18 21:56:56,441 EPOCH 6 done: loss 0.1799 - lr: 0.000022
2023-10-18 21:56:58,301 DEV : loss 0.15175634622573853 - f1-score (micro avg) 0.4636
2023-10-18 21:56:58,320 ----------------------------------------------------------------------------------------------------
2023-10-18 21:57:00,560 epoch 7 - iter 99/992 - loss 0.17331896 - time (sec): 2.24 - samples/sec: 6979.09 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:57:02,905 epoch 7 - iter 198/992 - loss 0.17621820 - time (sec): 4.58 - samples/sec: 7191.86 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:57:05,242 epoch 7 - iter 297/992 - loss 0.17242925 - time (sec): 6.92 - samples/sec: 7159.28 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:57:07,579 epoch 7 - iter 396/992 - loss 0.16680316 - time (sec): 9.26 - samples/sec: 7077.62 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:57:09,886 epoch 7 - iter 495/992 - loss 0.16917314 - time (sec): 11.57 - samples/sec: 7049.19 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:57:12,114 epoch 7 - iter 594/992 - loss 0.16845160 - time (sec): 13.79 - samples/sec: 7097.60 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:57:14,364 epoch 7 - iter 693/992 - loss 0.16833284 - time (sec): 16.04 - samples/sec: 7120.30 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:57:16,729 epoch 7 - iter 792/992 - loss 0.16816329 - time (sec): 18.41 - samples/sec: 7189.87 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:57:18,914 epoch 7 - iter 891/992 - loss 0.16870231 - time (sec): 20.59 - samples/sec: 7207.37 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:57:21,088 epoch 7 - iter 990/992 - loss 0.17120803 - time (sec): 22.77 - samples/sec: 7184.77 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:57:21,133 ----------------------------------------------------------------------------------------------------
2023-10-18 21:57:21,133 EPOCH 7 done: loss 0.1710 - lr: 0.000017
2023-10-18 21:57:23,365 DEV : loss 0.14771050214767456 - f1-score (micro avg) 0.4614
2023-10-18 21:57:23,384 ----------------------------------------------------------------------------------------------------
2023-10-18 21:57:25,576 epoch 8 - iter 99/992 - loss 0.16145510 - time (sec): 2.19 - samples/sec: 8094.49 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:57:27,798 epoch 8 - iter 198/992 - loss 0.16238181 - time (sec): 4.41 - samples/sec: 7801.61 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:57:30,048 epoch 8 - iter 297/992 - loss 0.16042643 - time (sec): 6.66 - samples/sec: 7743.08 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:57:32,312 epoch 8 - iter 396/992 - loss 0.16123392 - time (sec): 8.93 - samples/sec: 7716.08 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:57:34,474 epoch 8 - iter 495/992 - loss 0.15882526 - time (sec): 11.09 - samples/sec: 7568.87 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:57:36,702 epoch 8 - iter 594/992 - loss 0.15852308 - time (sec): 13.32 - samples/sec: 7545.54 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:57:38,933 epoch 8 - iter 693/992 - loss 0.16090394 - time (sec): 15.55 - samples/sec: 7490.79 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:57:41,122 epoch 8 - iter 792/992 - loss 0.16162584 - time (sec): 17.74 - samples/sec: 7465.42 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:57:43,302 epoch 8 - iter 891/992 - loss 0.16350113 - time (sec): 19.92 - samples/sec: 7410.15 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:57:45,570 epoch 8 - iter 990/992 - loss 0.16534095 - time (sec): 22.19 - samples/sec: 7377.71 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:57:45,617 ----------------------------------------------------------------------------------------------------
2023-10-18 21:57:45,617 EPOCH 8 done: loss 0.1653 - lr: 0.000011
2023-10-18 21:57:47,461 DEV : loss 0.15022730827331543 - f1-score (micro avg) 0.4734
2023-10-18 21:57:47,483 saving best model
2023-10-18 21:57:47,522 ----------------------------------------------------------------------------------------------------
2023-10-18 21:57:49,845 epoch 9 - iter 99/992 - loss 0.15429276 - time (sec): 2.32 - samples/sec: 7144.31 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:57:52,069 epoch 9 - iter 198/992 - loss 0.15976108 - time (sec): 4.55 - samples/sec: 7389.48 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:57:54,419 epoch 9 - iter 297/992 - loss 0.16814053 - time (sec): 6.90 - samples/sec: 7414.35 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:57:56,610 epoch 9 - iter 396/992 - loss 0.16893864 - time (sec): 9.09 - samples/sec: 7402.74 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:57:58,842 epoch 9 - iter 495/992 - loss 0.16480279 - time (sec): 11.32 - samples/sec: 7368.59 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:58:01,049 epoch 9 - iter 594/992 - loss 0.16380758 - time (sec): 13.53 - samples/sec: 7331.97 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:58:03,242 epoch 9 - iter 693/992 - loss 0.16219906 - time (sec): 15.72 - samples/sec: 7370.67 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:58:05,443 epoch 9 - iter 792/992 - loss 0.16449280 - time (sec): 17.92 - samples/sec: 7342.15 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:58:07,702 epoch 9 - iter 891/992 - loss 0.16241771 - time (sec): 20.18 - samples/sec: 7314.62 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:58:09,923 epoch 9 - iter 990/992 - loss 0.15995791 - time (sec): 22.40 - samples/sec: 7312.03 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:58:09,963 ----------------------------------------------------------------------------------------------------
2023-10-18 21:58:09,964 EPOCH 9 done: loss 0.1599 - lr: 0.000006
2023-10-18 21:58:11,791 DEV : loss 0.15096156299114227 - f1-score (micro avg) 0.4811
2023-10-18 21:58:11,811 saving best model
2023-10-18 21:58:11,847 ----------------------------------------------------------------------------------------------------
2023-10-18 21:58:14,177 epoch 10 - iter 99/992 - loss 0.16010106 - time (sec): 2.33 - samples/sec: 6908.82 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:58:16,389 epoch 10 - iter 198/992 - loss 0.14789645 - time (sec): 4.54 - samples/sec: 7237.38 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:58:18,701 epoch 10 - iter 297/992 - loss 0.14655674 - time (sec): 6.85 - samples/sec: 7318.07 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:58:20,992 epoch 10 - iter 396/992 - loss 0.15028578 - time (sec): 9.14 - samples/sec: 7188.41 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:58:23,161 epoch 10 - iter 495/992 - loss 0.15310770 - time (sec): 11.31 - samples/sec: 7192.41 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:58:25,426 epoch 10 - iter 594/992 - loss 0.15629795 - time (sec): 13.58 - samples/sec: 7213.73 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:58:27,654 epoch 10 - iter 693/992 - loss 0.15763389 - time (sec): 15.81 - samples/sec: 7241.31 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:58:29,889 epoch 10 - iter 792/992 - loss 0.15858926 - time (sec): 18.04 - samples/sec: 7295.51 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:58:32,062 epoch 10 - iter 891/992 - loss 0.15834786 - time (sec): 20.21 - samples/sec: 7287.22 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:58:34,265 epoch 10 - iter 990/992 - loss 0.15652090 - time (sec): 22.42 - samples/sec: 7302.97 - lr: 0.000000 - momentum: 0.000000
2023-10-18 21:58:34,312 ----------------------------------------------------------------------------------------------------
2023-10-18 21:58:34,312 EPOCH 10 done: loss 0.1566 - lr: 0.000000
2023-10-18 21:58:36,128 DEV : loss 0.1495286077260971 - f1-score (micro avg) 0.4881
2023-10-18 21:58:36,146 saving best model
2023-10-18 21:58:36,204 ----------------------------------------------------------------------------------------------------
2023-10-18 21:58:36,204 Loading model from best epoch ...
2023-10-18 21:58:36,283 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 21:58:37,756
Results:
- F-score (micro) 0.542
- F-score (macro) 0.3671
- Accuracy 0.4099
By class:
              precision    recall  f1-score   support

         LOC     0.7039    0.6824    0.6930       655
         PER     0.2770    0.5067    0.3582       223
         ORG     0.1212    0.0315    0.0500       127

   micro avg     0.5242    0.5612    0.5420      1005
   macro avg     0.3674    0.4069    0.3671      1005
weighted avg     0.5356    0.5612    0.5375      1005
2023-10-18 21:58:37,756 ----------------------------------------------------------------------------------------------------