2023-10-18 21:44:50,558 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 21:44:50,559 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 21:44:50,559 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 Train: 7936 sentences
2023-10-18 21:44:50,559 (train_with_dev=False, train_with_test=False)
2023-10-18 21:44:50,559 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 Training Params:
2023-10-18 21:44:50,559 - learning_rate: "5e-05"
2023-10-18 21:44:50,559 - mini_batch_size: "4"
2023-10-18 21:44:50,559 - max_epochs: "10"
2023-10-18 21:44:50,559 - shuffle: "True"
2023-10-18 21:44:50,559 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 Plugins:
2023-10-18 21:44:50,559 - TensorboardLogger
2023-10-18 21:44:50,559 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 21:44:50,559 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 21:44:50,559 - metric: "('micro avg', 'f1-score')"
2023-10-18 21:44:50,559 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,559 Computation:
2023-10-18 21:44:50,559 - compute on device: cuda:0
2023-10-18 21:44:50,559 - embedding storage: none
2023-10-18 21:44:50,559 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,560 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 21:44:50,560 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,560 ----------------------------------------------------------------------------------------------------
2023-10-18 21:44:50,560 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 21:44:53,702 epoch 1 - iter 198/1984 - loss 3.02755835 - time (sec): 3.14 - samples/sec: 5433.82 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:44:56,723 epoch 1 - iter 396/1984 - loss 2.50705750 - time (sec): 6.16 - samples/sec: 5318.29 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:44:59,776 epoch 1 - iter 594/1984 - loss 1.89414489 - time (sec): 9.22 - samples/sec: 5401.23 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:45:02,860 epoch 1 - iter 792/1984 - loss 1.52591738 - time (sec): 12.30 - samples/sec: 5438.96 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:45:05,884 epoch 1 - iter 990/1984 - loss 1.32097567 - time (sec): 15.32 - samples/sec: 5430.32 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:45:08,933 epoch 1 - iter 1188/1984 - loss 1.16575852 - time (sec): 18.37 - samples/sec: 5416.07 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:45:11,986 epoch 1 - iter 1386/1984 - loss 1.04445703 - time (sec): 21.43 - samples/sec: 5421.99 - lr: 0.000035 - momentum: 0.000000
2023-10-18 21:45:15,011 epoch 1 - iter 1584/1984 - loss 0.95783838 - time (sec): 24.45 - samples/sec: 5385.50 - lr: 0.000040 - momentum: 0.000000
2023-10-18 21:45:18,008 epoch 1 - iter 1782/1984 - loss 0.88791048 - time (sec): 27.45 - samples/sec: 5377.95 - lr: 0.000045 - momentum: 0.000000
2023-10-18 21:45:21,038 epoch 1 - iter 1980/1984 - loss 0.83155339 - time (sec): 30.48 - samples/sec: 5371.39 - lr: 0.000050 - momentum: 0.000000
2023-10-18 21:45:21,095 ----------------------------------------------------------------------------------------------------
2023-10-18 21:45:21,095 EPOCH 1 done: loss 0.8309 - lr: 0.000050
2023-10-18 21:45:22,631 DEV : loss 0.21765930950641632 - f1-score (micro avg) 0.3337
2023-10-18 21:45:22,649 saving best model
2023-10-18 21:45:22,682 ----------------------------------------------------------------------------------------------------
2023-10-18 21:45:25,694 epoch 2 - iter 198/1984 - loss 0.31683963 - time (sec): 3.01 - samples/sec: 5386.59 - lr: 0.000049 - momentum: 0.000000
2023-10-18 21:45:28,770 epoch 2 - iter 396/1984 - loss 0.29456292 - time (sec): 6.09 - samples/sec: 5410.37 - lr: 0.000049 - momentum: 0.000000
2023-10-18 21:45:31,796 epoch 2 - iter 594/1984 - loss 0.28745860 - time (sec): 9.11 - samples/sec: 5414.94 - lr: 0.000048 - momentum: 0.000000
2023-10-18 21:45:34,843 epoch 2 - iter 792/1984 - loss 0.28332924 - time (sec): 12.16 - samples/sec: 5467.12 - lr: 0.000048 - momentum: 0.000000
2023-10-18 21:45:37,882 epoch 2 - iter 990/1984 - loss 0.27820558 - time (sec): 15.20 - samples/sec: 5452.84 - lr: 0.000047 - momentum: 0.000000
2023-10-18 21:45:40,914 epoch 2 - iter 1188/1984 - loss 0.27692060 - time (sec): 18.23 - samples/sec: 5428.66 - lr: 0.000047 - momentum: 0.000000
2023-10-18 21:45:43,924 epoch 2 - iter 1386/1984 - loss 0.27338747 - time (sec): 21.24 - samples/sec: 5368.92 - lr: 0.000046 - momentum: 0.000000
2023-10-18 21:45:46,946 epoch 2 - iter 1584/1984 - loss 0.27329556 - time (sec): 24.26 - samples/sec: 5333.65 - lr: 0.000046 - momentum: 0.000000
2023-10-18 21:45:50,005 epoch 2 - iter 1782/1984 - loss 0.26872899 - time (sec): 27.32 - samples/sec: 5362.45 - lr: 0.000045 - momentum: 0.000000
2023-10-18 21:45:53,066 epoch 2 - iter 1980/1984 - loss 0.26392738 - time (sec): 30.38 - samples/sec: 5383.14 - lr: 0.000044 - momentum: 0.000000
2023-10-18 21:45:53,129 ----------------------------------------------------------------------------------------------------
2023-10-18 21:45:53,129 EPOCH 2 done: loss 0.2637 - lr: 0.000044
2023-10-18 21:45:54,965 DEV : loss 0.18782073259353638 - f1-score (micro avg) 0.3894
2023-10-18 21:45:54,983 saving best model
2023-10-18 21:45:55,015 ----------------------------------------------------------------------------------------------------
2023-10-18 21:45:57,994 epoch 3 - iter 198/1984 - loss 0.23184493 - time (sec): 2.98 - samples/sec: 5373.79 - lr: 0.000044 - momentum: 0.000000
2023-10-18 21:46:00,989 epoch 3 - iter 396/1984 - loss 0.22422770 - time (sec): 5.97 - samples/sec: 5368.69 - lr: 0.000043 - momentum: 0.000000
2023-10-18 21:46:03,985 epoch 3 - iter 594/1984 - loss 0.21842906 - time (sec): 8.97 - samples/sec: 5335.00 - lr: 0.000043 - momentum: 0.000000
2023-10-18 21:46:07,452 epoch 3 - iter 792/1984 - loss 0.22427957 - time (sec): 12.44 - samples/sec: 5161.76 - lr: 0.000042 - momentum: 0.000000
2023-10-18 21:46:10,251 epoch 3 - iter 990/1984 - loss 0.22526202 - time (sec): 15.23 - samples/sec: 5294.55 - lr: 0.000042 - momentum: 0.000000
2023-10-18 21:46:13,311 epoch 3 - iter 1188/1984 - loss 0.22436343 - time (sec): 18.29 - samples/sec: 5313.27 - lr: 0.000041 - momentum: 0.000000
2023-10-18 21:46:16,377 epoch 3 - iter 1386/1984 - loss 0.22350554 - time (sec): 21.36 - samples/sec: 5321.49 - lr: 0.000041 - momentum: 0.000000
2023-10-18 21:46:19,427 epoch 3 - iter 1584/1984 - loss 0.22239930 - time (sec): 24.41 - samples/sec: 5338.02 - lr: 0.000040 - momentum: 0.000000
2023-10-18 21:46:22,480 epoch 3 - iter 1782/1984 - loss 0.22294790 - time (sec): 27.46 - samples/sec: 5331.42 - lr: 0.000039 - momentum: 0.000000
2023-10-18 21:46:25,656 epoch 3 - iter 1980/1984 - loss 0.21981448 - time (sec): 30.64 - samples/sec: 5343.64 - lr: 0.000039 - momentum: 0.000000
2023-10-18 21:46:25,714 ----------------------------------------------------------------------------------------------------
2023-10-18 21:46:25,714 EPOCH 3 done: loss 0.2202 - lr: 0.000039
2023-10-18 21:46:27,536 DEV : loss 0.16426852345466614 - f1-score (micro avg) 0.4378
2023-10-18 21:46:27,556 saving best model
2023-10-18 21:46:27,591 ----------------------------------------------------------------------------------------------------
2023-10-18 21:46:30,645 epoch 4 - iter 198/1984 - loss 0.21821162 - time (sec): 3.05 - samples/sec: 5334.38 - lr: 0.000038 - momentum: 0.000000
2023-10-18 21:46:33,608 epoch 4 - iter 396/1984 - loss 0.20572668 - time (sec): 6.02 - samples/sec: 5322.24 - lr: 0.000038 - momentum: 0.000000
2023-10-18 21:46:36,639 epoch 4 - iter 594/1984 - loss 0.20384467 - time (sec): 9.05 - samples/sec: 5369.64 - lr: 0.000037 - momentum: 0.000000
2023-10-18 21:46:39,741 epoch 4 - iter 792/1984 - loss 0.20226613 - time (sec): 12.15 - samples/sec: 5356.48 - lr: 0.000037 - momentum: 0.000000
2023-10-18 21:46:42,759 epoch 4 - iter 990/1984 - loss 0.20003297 - time (sec): 15.17 - samples/sec: 5402.33 - lr: 0.000036 - momentum: 0.000000
2023-10-18 21:46:45,772 epoch 4 - iter 1188/1984 - loss 0.20039277 - time (sec): 18.18 - samples/sec: 5425.68 - lr: 0.000036 - momentum: 0.000000
2023-10-18 21:46:48,837 epoch 4 - iter 1386/1984 - loss 0.20002361 - time (sec): 21.25 - samples/sec: 5445.13 - lr: 0.000035 - momentum: 0.000000
2023-10-18 21:46:51,873 epoch 4 - iter 1584/1984 - loss 0.19776710 - time (sec): 24.28 - samples/sec: 5438.11 - lr: 0.000034 - momentum: 0.000000
2023-10-18 21:46:54,901 epoch 4 - iter 1782/1984 - loss 0.19638486 - time (sec): 27.31 - samples/sec: 5423.82 - lr: 0.000034 - momentum: 0.000000
2023-10-18 21:46:57,951 epoch 4 - iter 1980/1984 - loss 0.19597838 - time (sec): 30.36 - samples/sec: 5390.08 - lr: 0.000033 - momentum: 0.000000
2023-10-18 21:46:58,013 ----------------------------------------------------------------------------------------------------
2023-10-18 21:46:58,013 EPOCH 4 done: loss 0.1958 - lr: 0.000033
2023-10-18 21:46:59,836 DEV : loss 0.1590408831834793 - f1-score (micro avg) 0.4645
2023-10-18 21:46:59,855 saving best model
2023-10-18 21:46:59,889 ----------------------------------------------------------------------------------------------------
2023-10-18 21:47:02,987 epoch 5 - iter 198/1984 - loss 0.19862903 - time (sec): 3.10 - samples/sec: 4997.21 - lr: 0.000033 - momentum: 0.000000
2023-10-18 21:47:06,060 epoch 5 - iter 396/1984 - loss 0.19073285 - time (sec): 6.17 - samples/sec: 5095.42 - lr: 0.000032 - momentum: 0.000000
2023-10-18 21:47:09,118 epoch 5 - iter 594/1984 - loss 0.18017645 - time (sec): 9.23 - samples/sec: 5174.83 - lr: 0.000032 - momentum: 0.000000
2023-10-18 21:47:12,192 epoch 5 - iter 792/1984 - loss 0.18150575 - time (sec): 12.30 - samples/sec: 5182.82 - lr: 0.000031 - momentum: 0.000000
2023-10-18 21:47:15,250 epoch 5 - iter 990/1984 - loss 0.18132472 - time (sec): 15.36 - samples/sec: 5225.93 - lr: 0.000031 - momentum: 0.000000
2023-10-18 21:47:18,322 epoch 5 - iter 1188/1984 - loss 0.17910642 - time (sec): 18.43 - samples/sec: 5286.47 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:47:21,373 epoch 5 - iter 1386/1984 - loss 0.17861652 - time (sec): 21.48 - samples/sec: 5314.48 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:47:24,445 epoch 5 - iter 1584/1984 - loss 0.17851281 - time (sec): 24.56 - samples/sec: 5329.37 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:47:27,499 epoch 5 - iter 1782/1984 - loss 0.17681625 - time (sec): 27.61 - samples/sec: 5357.24 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:47:30,515 epoch 5 - iter 1980/1984 - loss 0.17933583 - time (sec): 30.62 - samples/sec: 5342.88 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:47:30,583 ----------------------------------------------------------------------------------------------------
2023-10-18 21:47:30,583 EPOCH 5 done: loss 0.1792 - lr: 0.000028
2023-10-18 21:47:32,431 DEV : loss 0.15715673565864563 - f1-score (micro avg) 0.5065
2023-10-18 21:47:32,449 saving best model
2023-10-18 21:47:32,487 ----------------------------------------------------------------------------------------------------
2023-10-18 21:47:35,526 epoch 6 - iter 198/1984 - loss 0.18455211 - time (sec): 3.04 - samples/sec: 5515.02 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:47:38,795 epoch 6 - iter 396/1984 - loss 0.18436525 - time (sec): 6.31 - samples/sec: 5325.15 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:47:41,821 epoch 6 - iter 594/1984 - loss 0.17975805 - time (sec): 9.33 - samples/sec: 5309.96 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:47:44,832 epoch 6 - iter 792/1984 - loss 0.17157674 - time (sec): 12.34 - samples/sec: 5300.16 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:47:47,848 epoch 6 - iter 990/1984 - loss 0.17078466 - time (sec): 15.36 - samples/sec: 5315.70 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:47:50,932 epoch 6 - iter 1188/1984 - loss 0.16688998 - time (sec): 18.44 - samples/sec: 5337.14 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:47:54,002 epoch 6 - iter 1386/1984 - loss 0.16587726 - time (sec): 21.51 - samples/sec: 5396.04 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:47:57,025 epoch 6 - iter 1584/1984 - loss 0.16677262 - time (sec): 24.54 - samples/sec: 5343.91 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:48:00,020 epoch 6 - iter 1782/1984 - loss 0.16683507 - time (sec): 27.53 - samples/sec: 5344.66 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:48:03,006 epoch 6 - iter 1980/1984 - loss 0.16690010 - time (sec): 30.52 - samples/sec: 5359.47 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:48:03,067 ----------------------------------------------------------------------------------------------------
2023-10-18 21:48:03,067 EPOCH 6 done: loss 0.1674 - lr: 0.000022
2023-10-18 21:48:04,904 DEV : loss 0.15758880972862244 - f1-score (micro avg) 0.5325
2023-10-18 21:48:04,922 saving best model
2023-10-18 21:48:04,961 ----------------------------------------------------------------------------------------------------
2023-10-18 21:48:07,977 epoch 7 - iter 198/1984 - loss 0.15459941 - time (sec): 3.02 - samples/sec: 5183.56 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:48:10,991 epoch 7 - iter 396/1984 - loss 0.16073533 - time (sec): 6.03 - samples/sec: 5469.06 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:48:14,051 epoch 7 - iter 594/1984 - loss 0.15792323 - time (sec): 9.09 - samples/sec: 5452.10 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:48:17,116 epoch 7 - iter 792/1984 - loss 0.15303565 - time (sec): 12.15 - samples/sec: 5391.67 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:48:20,201 epoch 7 - iter 990/1984 - loss 0.15501183 - time (sec): 15.24 - samples/sec: 5350.01 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:48:23,341 epoch 7 - iter 1188/1984 - loss 0.15432518 - time (sec): 18.38 - samples/sec: 5327.01 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:48:26,378 epoch 7 - iter 1386/1984 - loss 0.15354368 - time (sec): 21.42 - samples/sec: 5334.08 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:48:29,540 epoch 7 - iter 1584/1984 - loss 0.15398265 - time (sec): 24.58 - samples/sec: 5384.87 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:48:32,576 epoch 7 - iter 1782/1984 - loss 0.15459399 - time (sec): 27.61 - samples/sec: 5374.90 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:48:35,610 epoch 7 - iter 1980/1984 - loss 0.15754883 - time (sec): 30.65 - samples/sec: 5337.49 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:48:35,671 ----------------------------------------------------------------------------------------------------
2023-10-18 21:48:35,671 EPOCH 7 done: loss 0.1573 - lr: 0.000017
2023-10-18 21:48:37,508 DEV : loss 0.15788982808589935 - f1-score (micro avg) 0.5569
2023-10-18 21:48:37,527 saving best model
2023-10-18 21:48:37,561 ----------------------------------------------------------------------------------------------------
2023-10-18 21:48:40,596 epoch 8 - iter 198/1984 - loss 0.14924294 - time (sec): 3.03 - samples/sec: 5847.53 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:48:43,640 epoch 8 - iter 396/1984 - loss 0.14667895 - time (sec): 6.08 - samples/sec: 5664.45 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:48:46,622 epoch 8 - iter 594/1984 - loss 0.14204086 - time (sec): 9.06 - samples/sec: 5694.55 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:48:49,614 epoch 8 - iter 792/1984 - loss 0.14280278 - time (sec): 12.05 - samples/sec: 5715.33 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:48:52,626 epoch 8 - iter 990/1984 - loss 0.14084806 - time (sec): 15.06 - samples/sec: 5572.09 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:48:55,678 epoch 8 - iter 1188/1984 - loss 0.14241472 - time (sec): 18.12 - samples/sec: 5546.88 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:48:58,761 epoch 8 - iter 1386/1984 - loss 0.14358925 - time (sec): 21.20 - samples/sec: 5494.28 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:49:01,761 epoch 8 - iter 1584/1984 - loss 0.14471371 - time (sec): 24.20 - samples/sec: 5472.04 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:49:04,782 epoch 8 - iter 1782/1984 - loss 0.14729854 - time (sec): 27.22 - samples/sec: 5422.32 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:49:07,819 epoch 8 - iter 1980/1984 - loss 0.14902796 - time (sec): 30.26 - samples/sec: 5409.51 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:49:07,876 ----------------------------------------------------------------------------------------------------
2023-10-18 21:49:07,876 EPOCH 8 done: loss 0.1490 - lr: 0.000011
2023-10-18 21:49:10,078 DEV : loss 0.15831266343593597 - f1-score (micro avg) 0.5582
2023-10-18 21:49:10,097 saving best model
2023-10-18 21:49:10,130 ----------------------------------------------------------------------------------------------------
2023-10-18 21:49:13,217 epoch 9 - iter 198/1984 - loss 0.13787577 - time (sec): 3.09 - samples/sec: 5374.53 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:49:16,276 epoch 9 - iter 396/1984 - loss 0.14301758 - time (sec): 6.15 - samples/sec: 5466.73 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:49:19,440 epoch 9 - iter 594/1984 - loss 0.14924873 - time (sec): 9.31 - samples/sec: 5493.05 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:49:22,490 epoch 9 - iter 792/1984 - loss 0.15082222 - time (sec): 12.36 - samples/sec: 5442.31 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:49:25,512 epoch 9 - iter 990/1984 - loss 0.14676994 - time (sec): 15.38 - samples/sec: 5422.54 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:49:28,365 epoch 9 - iter 1188/1984 - loss 0.14564122 - time (sec): 18.23 - samples/sec: 5438.66 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:49:31,146 epoch 9 - iter 1386/1984 - loss 0.14469641 - time (sec): 21.02 - samples/sec: 5512.99 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:49:34,194 epoch 9 - iter 1584/1984 - loss 0.14740619 - time (sec): 24.06 - samples/sec: 5467.90 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:49:37,127 epoch 9 - iter 1782/1984 - loss 0.14524676 - time (sec): 27.00 - samples/sec: 5467.48 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:49:40,115 epoch 9 - iter 1980/1984 - loss 0.14330917 - time (sec): 29.98 - samples/sec: 5462.56 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:49:40,171 ----------------------------------------------------------------------------------------------------
2023-10-18 21:49:40,171 EPOCH 9 done: loss 0.1432 - lr: 0.000006
2023-10-18 21:49:41,979 DEV : loss 0.16277986764907837 - f1-score (micro avg) 0.5731
2023-10-18 21:49:41,998 saving best model
2023-10-18 21:49:42,031 ----------------------------------------------------------------------------------------------------
2023-10-18 21:49:45,068 epoch 10 - iter 198/1984 - loss 0.14623963 - time (sec): 3.04 - samples/sec: 5299.69 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:49:48,152 epoch 10 - iter 396/1984 - loss 0.13592264 - time (sec): 6.12 - samples/sec: 5369.93 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:49:51,235 epoch 10 - iter 594/1984 - loss 0.13273236 - time (sec): 9.20 - samples/sec: 5449.73 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:49:54,260 epoch 10 - iter 792/1984 - loss 0.13486998 - time (sec): 12.23 - samples/sec: 5375.78 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:49:57,283 epoch 10 - iter 990/1984 - loss 0.13630045 - time (sec): 15.25 - samples/sec: 5335.35 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:50:00,321 epoch 10 - iter 1188/1984 - loss 0.13938042 - time (sec): 18.29 - samples/sec: 5355.77 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:50:03,337 epoch 10 - iter 1386/1984 - loss 0.14082238 - time (sec): 21.30 - samples/sec: 5372.41 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:50:06,512 epoch 10 - iter 1584/1984 - loss 0.14201225 - time (sec): 24.48 - samples/sec: 5376.70 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:50:09,464 epoch 10 - iter 1782/1984 - loss 0.14275124 - time (sec): 27.43 - samples/sec: 5369.93 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:50:12,545 epoch 10 - iter 1980/1984 - loss 0.14099621 - time (sec): 30.51 - samples/sec: 5365.36 - lr: 0.000000 - momentum: 0.000000
2023-10-18 21:50:12,606 ----------------------------------------------------------------------------------------------------
2023-10-18 21:50:12,606 EPOCH 10 done: loss 0.1412 - lr: 0.000000
2023-10-18 21:50:14,448 DEV : loss 0.1613704115152359 - f1-score (micro avg) 0.5714
2023-10-18 21:50:14,498 ----------------------------------------------------------------------------------------------------
2023-10-18 21:50:14,498 Loading model from best epoch ...
2023-10-18 21:50:14,584 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 21:50:16,122
Results:
- F-score (micro) 0.579
- F-score (macro) 0.4219
- Accuracy 0.4471

By class:
              precision    recall  f1-score   support

         LOC     0.7049    0.7038    0.7044       655
         PER     0.3574    0.5336    0.4281       223
         ORG     0.2264    0.0945    0.1333       127

   micro avg     0.5692    0.5891    0.5790      1005
   macro avg     0.4296    0.4440    0.4219      1005
weighted avg     0.5673    0.5891    0.5709      1005
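
Note on the averages above: the summary rows can be approximately re-derived from the per-class rows. The sketch below is not part of the training run. It copies the rounded precision, recall and support values from the table and recomputes the per-class, macro, weighted and micro F1 scores. The helper name `f1` and the variable names are illustrative. Because the inputs are already rounded to four decimals, the recomputed values should agree with the logged ones only to about three decimal places.

```python
# Per-class (precision, recall, support), copied from the "By class" table.
per_class = {
    "LOC": (0.7049, 0.7038, 655),
    "PER": (0.3574, 0.5336, 223),
    "ORG": (0.2264, 0.0945, 127),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# F1 per class, from the rounded precision/recall values.
class_f1 = {label: f1(p, r) for label, (p, r, _) in per_class.items()}

# Macro average: unweighted mean over the three classes.
macro_f1 = sum(class_f1.values()) / len(class_f1)

# Weighted average: mean over classes, weighted by support.
total_support = sum(s for _, _, s in per_class.values())
weighted_f1 = sum(class_f1[label] * s for label, (_, _, s) in per_class.items()) / total_support

# Micro average: F1 of the pooled precision and recall from the "micro avg" row.
micro_f1 = f1(0.5692, 0.5891)

print(class_f1, macro_f1, weighted_f1, micro_f1, total_support)
```

The gap between the micro (0.5790) and macro (0.4219) scores comes from the class imbalance: LOC holds 655 of the 1005 gold spans and scores far higher than ORG, which has only 127.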
2023-10-18 21:50:16,122 ----------------------------------------------------------------------------------------------------
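
Note on the lr column: the logged learning rates are consistent with a linear warmup over the first 10% of steps (warmup_fraction 0.1) followed by a linear decay to zero. The sketch below assumes that shape; it is not taken from the Flair source, and the function name `linear_warmup_lr` and its parameters are illustrative. The defaults come from the log: learning_rate 5e-05, 1984 iterations per epoch, max_epochs 10.

```python
def linear_warmup_lr(step, base_lr=5e-05, steps_per_epoch=1984,
                     max_epochs=10, warmup_fraction=0.1):
    """Learning rate after `step` optimizer steps under an assumed
    linear-warmup / linear-decay schedule."""
    total_steps = steps_per_epoch * max_epochs
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: ramp linearly from base_lr down to 0 at the last step.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Compare with the log: epoch 1 iter 198, end of epoch 1, end of epoch 2, end of training.
for step in (198, 1984, 2 * 1984, 10 * 1984):
    print(step, f"{linear_warmup_lr(step):.6f}")
```

Under this assumption the warmup ends exactly at the end of epoch 1 (1984 of 19840 steps), which matches the logged lr reaching 0.000050 at that point and falling in every later epoch.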