2023-10-18 21:15:31,626 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,626 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
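The module dump above fully determines the size of this BERT-tiny encoder (hidden size 128, intermediate size 512, 2 layers, vocabulary 32001). As a sanity check, the parameters can be tallied in plain Python; this is an independent count derived from the printout, not output of the training run:

```python
# Tally the parameters of the BertModel printed above.
# All sizes are read from the module dump; Linear/LayerNorm counts include biases.

def linear(n_in, n_out):
    return n_in * n_out + n_out        # weight matrix + bias vector

def layer_norm(n):
    return 2 * n                       # gamma + beta

hidden, inter, vocab = 128, 512, 32001

embeddings = (
    vocab * hidden                     # word_embeddings
    + 512 * hidden                     # position_embeddings
    + 2 * hidden                       # token_type_embeddings
    + layer_norm(hidden)
)

per_layer = (
    3 * linear(hidden, hidden)         # query / key / value
    + linear(hidden, hidden)           # attention output dense
    + layer_norm(hidden)
    + linear(hidden, inter)            # intermediate
    + linear(inter, hidden)            # output dense
    + layer_norm(hidden)
)

pooler = linear(hidden, hidden)
tagger_head = linear(hidden, 13)       # the (linear) projection to 13 tags

total = embeddings + 2 * per_layer + pooler
print(embeddings, per_layer, total, tagger_head)
```

The encoder comes to roughly 4.6 M parameters, almost all of it in the word-embedding table, which is why this "tiny" configuration trains in seconds per epoch below.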
2023-10-18 21:15:31,627 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 Train: 7936 sentences
2023-10-18 21:15:31,627 (train_with_dev=False, train_with_test=False)
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 Training Params:
2023-10-18 21:15:31,627 - learning_rate: "5e-05"
2023-10-18 21:15:31,627 - mini_batch_size: "8"
2023-10-18 21:15:31,627 - max_epochs: "10"
2023-10-18 21:15:31,627 - shuffle: "True"
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 Plugins:
2023-10-18 21:15:31,627 - TensorboardLogger
2023-10-18 21:15:31,627 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
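The LinearScheduler with warmup_fraction 0.1 produces exactly the "lr:" column in the iteration lines below: a linear ramp over the first 10% of steps (the first epoch here, 992 of 9 920 batches), then linear decay to zero. A standalone sketch of that schedule (the function and its parameters are illustrative, not Flair's internal code):

```python
# Linear warmup followed by linear decay, as applied by the LinearScheduler plugin.
# peak_lr, total_steps and warmup_fraction are taken from the log above.

def linear_schedule(step, peak_lr=5e-5, total_steps=9920, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)  # 992 = one epoch
    if step < warmup_steps:
        return peak_lr * step / warmup_steps           # ramp up
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay

# Values match the "lr:" column of the iteration lines:
print(round(linear_schedule(99), 6))    # epoch 1, iter 99  -> 5e-06
print(round(linear_schedule(990), 6))   # epoch 1, iter 990 -> 5e-05
print(round(linear_schedule(1091), 6))  # epoch 2, iter 99  -> 4.9e-05
print(linear_schedule(9920))            # end of training   -> 0.0
```

The "momentum: 0.000000" in the same lines simply reflects that AdamW-style fine-tuning is used, so no classic SGD momentum is tracked.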
2023-10-18 21:15:31,627 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 21:15:31,627 - metric: "('micro avg', 'f1-score')"
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 Computation:
2023-10-18 21:15:31,627 - compute on device: cuda:0
2023-10-18 21:15:31,627 - embedding storage: none
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 Model training base path: "hmbench-icdar/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,627 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:31,628 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 21:15:33,871 epoch 1 - iter 99/992 - loss 3.00035148 - time (sec): 2.24 - samples/sec: 7504.52 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:15:36,122 epoch 1 - iter 198/992 - loss 2.65024826 - time (sec): 4.49 - samples/sec: 7377.52 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:15:38,453 epoch 1 - iter 297/992 - loss 2.14493534 - time (sec): 6.83 - samples/sec: 7396.19 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:15:40,761 epoch 1 - iter 396/992 - loss 1.77405864 - time (sec): 9.13 - samples/sec: 7228.32 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:15:42,985 epoch 1 - iter 495/992 - loss 1.52199448 - time (sec): 11.36 - samples/sec: 7224.42 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:15:45,245 epoch 1 - iter 594/992 - loss 1.33697784 - time (sec): 13.62 - samples/sec: 7236.13 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:15:47,493 epoch 1 - iter 693/992 - loss 1.20151051 - time (sec): 15.87 - samples/sec: 7232.93 - lr: 0.000035 - momentum: 0.000000
2023-10-18 21:15:49,727 epoch 1 - iter 792/992 - loss 1.09367873 - time (sec): 18.10 - samples/sec: 7246.04 - lr: 0.000040 - momentum: 0.000000
2023-10-18 21:15:51,954 epoch 1 - iter 891/992 - loss 1.00765596 - time (sec): 20.33 - samples/sec: 7260.10 - lr: 0.000045 - momentum: 0.000000
2023-10-18 21:15:54,200 epoch 1 - iter 990/992 - loss 0.94167931 - time (sec): 22.57 - samples/sec: 7252.23 - lr: 0.000050 - momentum: 0.000000
2023-10-18 21:15:54,247 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:54,247 EPOCH 1 done: loss 0.9409 - lr: 0.000050
2023-10-18 21:15:55,813 DEV : loss 0.20528535544872284 - f1-score (micro avg) 0.2927
2023-10-18 21:15:55,832 saving best model
2023-10-18 21:15:55,870 ----------------------------------------------------------------------------------------------------
2023-10-18 21:15:58,007 epoch 2 - iter 99/992 - loss 0.29525902 - time (sec): 2.14 - samples/sec: 7453.35 - lr: 0.000049 - momentum: 0.000000
2023-10-18 21:16:00,244 epoch 2 - iter 198/992 - loss 0.30020282 - time (sec): 4.37 - samples/sec: 7681.68 - lr: 0.000049 - momentum: 0.000000
2023-10-18 21:16:02,448 epoch 2 - iter 297/992 - loss 0.29276113 - time (sec): 6.58 - samples/sec: 7608.66 - lr: 0.000048 - momentum: 0.000000
2023-10-18 21:16:04,695 epoch 2 - iter 396/992 - loss 0.28180797 - time (sec): 8.82 - samples/sec: 7520.52 - lr: 0.000048 - momentum: 0.000000
2023-10-18 21:16:06,951 epoch 2 - iter 495/992 - loss 0.27842310 - time (sec): 11.08 - samples/sec: 7417.17 - lr: 0.000047 - momentum: 0.000000
2023-10-18 21:16:09,180 epoch 2 - iter 594/992 - loss 0.27099342 - time (sec): 13.31 - samples/sec: 7460.02 - lr: 0.000047 - momentum: 0.000000
2023-10-18 21:16:11,436 epoch 2 - iter 693/992 - loss 0.26826560 - time (sec): 15.57 - samples/sec: 7433.50 - lr: 0.000046 - momentum: 0.000000
2023-10-18 21:16:13,617 epoch 2 - iter 792/992 - loss 0.26477509 - time (sec): 17.75 - samples/sec: 7393.62 - lr: 0.000046 - momentum: 0.000000
2023-10-18 21:16:15,758 epoch 2 - iter 891/992 - loss 0.26317298 - time (sec): 19.89 - samples/sec: 7333.02 - lr: 0.000045 - momentum: 0.000000
2023-10-18 21:16:17,807 epoch 2 - iter 990/992 - loss 0.25944860 - time (sec): 21.94 - samples/sec: 7465.20 - lr: 0.000044 - momentum: 0.000000
2023-10-18 21:16:17,844 ----------------------------------------------------------------------------------------------------
2023-10-18 21:16:17,845 EPOCH 2 done: loss 0.2593 - lr: 0.000044
2023-10-18 21:16:20,056 DEV : loss 0.16714806854724884 - f1-score (micro avg) 0.4111
2023-10-18 21:16:20,075 saving best model
2023-10-18 21:16:20,110 ----------------------------------------------------------------------------------------------------
2023-10-18 21:16:22,218 epoch 3 - iter 99/992 - loss 0.23036403 - time (sec): 2.11 - samples/sec: 7812.27 - lr: 0.000044 - momentum: 0.000000
2023-10-18 21:16:24,469 epoch 3 - iter 198/992 - loss 0.23058276 - time (sec): 4.36 - samples/sec: 7626.43 - lr: 0.000043 - momentum: 0.000000
2023-10-18 21:16:26,711 epoch 3 - iter 297/992 - loss 0.21820133 - time (sec): 6.60 - samples/sec: 7536.57 - lr: 0.000043 - momentum: 0.000000
2023-10-18 21:16:28,916 epoch 3 - iter 396/992 - loss 0.22504416 - time (sec): 8.81 - samples/sec: 7493.40 - lr: 0.000042 - momentum: 0.000000
2023-10-18 21:16:31,138 epoch 3 - iter 495/992 - loss 0.22164956 - time (sec): 11.03 - samples/sec: 7402.51 - lr: 0.000042 - momentum: 0.000000
2023-10-18 21:16:33,390 epoch 3 - iter 594/992 - loss 0.22088483 - time (sec): 13.28 - samples/sec: 7400.22 - lr: 0.000041 - momentum: 0.000000
2023-10-18 21:16:35,663 epoch 3 - iter 693/992 - loss 0.21848559 - time (sec): 15.55 - samples/sec: 7360.05 - lr: 0.000041 - momentum: 0.000000
2023-10-18 21:16:38,037 epoch 3 - iter 792/992 - loss 0.21608363 - time (sec): 17.93 - samples/sec: 7370.97 - lr: 0.000040 - momentum: 0.000000
2023-10-18 21:16:40,357 epoch 3 - iter 891/992 - loss 0.21449733 - time (sec): 20.25 - samples/sec: 7305.29 - lr: 0.000039 - momentum: 0.000000
2023-10-18 21:16:42,642 epoch 3 - iter 990/992 - loss 0.21398506 - time (sec): 22.53 - samples/sec: 7266.44 - lr: 0.000039 - momentum: 0.000000
2023-10-18 21:16:42,687 ----------------------------------------------------------------------------------------------------
2023-10-18 21:16:42,688 EPOCH 3 done: loss 0.2139 - lr: 0.000039
2023-10-18 21:16:44,501 DEV : loss 0.14900191128253937 - f1-score (micro avg) 0.5128
2023-10-18 21:16:44,520 saving best model
2023-10-18 21:16:44,555 ----------------------------------------------------------------------------------------------------
2023-10-18 21:16:46,843 epoch 4 - iter 99/992 - loss 0.19212402 - time (sec): 2.29 - samples/sec: 7423.04 - lr: 0.000038 - momentum: 0.000000
2023-10-18 21:16:49,021 epoch 4 - iter 198/992 - loss 0.19042800 - time (sec): 4.47 - samples/sec: 7091.02 - lr: 0.000038 - momentum: 0.000000
2023-10-18 21:16:51,197 epoch 4 - iter 297/992 - loss 0.18919111 - time (sec): 6.64 - samples/sec: 7171.20 - lr: 0.000037 - momentum: 0.000000
2023-10-18 21:16:53,390 epoch 4 - iter 396/992 - loss 0.18793688 - time (sec): 8.83 - samples/sec: 7224.86 - lr: 0.000037 - momentum: 0.000000
2023-10-18 21:16:55,585 epoch 4 - iter 495/992 - loss 0.19170755 - time (sec): 11.03 - samples/sec: 7327.72 - lr: 0.000036 - momentum: 0.000000
2023-10-18 21:16:57,801 epoch 4 - iter 594/992 - loss 0.18825759 - time (sec): 13.25 - samples/sec: 7363.35 - lr: 0.000036 - momentum: 0.000000
2023-10-18 21:17:00,013 epoch 4 - iter 693/992 - loss 0.19066190 - time (sec): 15.46 - samples/sec: 7338.86 - lr: 0.000035 - momentum: 0.000000
2023-10-18 21:17:02,303 epoch 4 - iter 792/992 - loss 0.18828028 - time (sec): 17.75 - samples/sec: 7312.02 - lr: 0.000034 - momentum: 0.000000
2023-10-18 21:17:04,532 epoch 4 - iter 891/992 - loss 0.18940494 - time (sec): 19.98 - samples/sec: 7301.53 - lr: 0.000034 - momentum: 0.000000
2023-10-18 21:17:06,861 epoch 4 - iter 990/992 - loss 0.18948850 - time (sec): 22.31 - samples/sec: 7335.54 - lr: 0.000033 - momentum: 0.000000
2023-10-18 21:17:06,913 ----------------------------------------------------------------------------------------------------
2023-10-18 21:17:06,913 EPOCH 4 done: loss 0.1897 - lr: 0.000033
2023-10-18 21:17:08,764 DEV : loss 0.14117993414402008 - f1-score (micro avg) 0.5446
2023-10-18 21:17:08,783 saving best model
2023-10-18 21:17:08,817 ----------------------------------------------------------------------------------------------------
2023-10-18 21:17:11,009 epoch 5 - iter 99/992 - loss 0.15229802 - time (sec): 2.19 - samples/sec: 7372.22 - lr: 0.000033 - momentum: 0.000000
2023-10-18 21:17:13,239 epoch 5 - iter 198/992 - loss 0.16129571 - time (sec): 4.42 - samples/sec: 7336.97 - lr: 0.000032 - momentum: 0.000000
2023-10-18 21:17:15,485 epoch 5 - iter 297/992 - loss 0.16681954 - time (sec): 6.67 - samples/sec: 7239.47 - lr: 0.000032 - momentum: 0.000000
2023-10-18 21:17:17,603 epoch 5 - iter 396/992 - loss 0.16422500 - time (sec): 8.78 - samples/sec: 7408.94 - lr: 0.000031 - momentum: 0.000000
2023-10-18 21:17:19,858 epoch 5 - iter 495/992 - loss 0.16568909 - time (sec): 11.04 - samples/sec: 7348.81 - lr: 0.000031 - momentum: 0.000000
2023-10-18 21:17:22,037 epoch 5 - iter 594/992 - loss 0.16934167 - time (sec): 13.22 - samples/sec: 7304.06 - lr: 0.000030 - momentum: 0.000000
2023-10-18 21:17:24,279 epoch 5 - iter 693/992 - loss 0.17025896 - time (sec): 15.46 - samples/sec: 7320.71 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:17:26,566 epoch 5 - iter 792/992 - loss 0.17179453 - time (sec): 17.75 - samples/sec: 7355.20 - lr: 0.000029 - momentum: 0.000000
2023-10-18 21:17:28,773 epoch 5 - iter 891/992 - loss 0.17143573 - time (sec): 19.96 - samples/sec: 7392.66 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:17:30,997 epoch 5 - iter 990/992 - loss 0.17025017 - time (sec): 22.18 - samples/sec: 7379.74 - lr: 0.000028 - momentum: 0.000000
2023-10-18 21:17:31,042 ----------------------------------------------------------------------------------------------------
2023-10-18 21:17:31,042 EPOCH 5 done: loss 0.1701 - lr: 0.000028
2023-10-18 21:17:32,868 DEV : loss 0.1357688307762146 - f1-score (micro avg) 0.5582
2023-10-18 21:17:32,887 saving best model
2023-10-18 21:17:32,922 ----------------------------------------------------------------------------------------------------
2023-10-18 21:17:35,146 epoch 6 - iter 99/992 - loss 0.17300109 - time (sec): 2.22 - samples/sec: 7210.04 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:17:37,367 epoch 6 - iter 198/992 - loss 0.17503273 - time (sec): 4.44 - samples/sec: 7303.51 - lr: 0.000027 - momentum: 0.000000
2023-10-18 21:17:39,620 epoch 6 - iter 297/992 - loss 0.16777607 - time (sec): 6.70 - samples/sec: 7436.18 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:17:41,855 epoch 6 - iter 396/992 - loss 0.16541905 - time (sec): 8.93 - samples/sec: 7339.38 - lr: 0.000026 - momentum: 0.000000
2023-10-18 21:17:44,101 epoch 6 - iter 495/992 - loss 0.16309203 - time (sec): 11.18 - samples/sec: 7376.75 - lr: 0.000025 - momentum: 0.000000
2023-10-18 21:17:46,202 epoch 6 - iter 594/992 - loss 0.16325905 - time (sec): 13.28 - samples/sec: 7387.33 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:17:48,434 epoch 6 - iter 693/992 - loss 0.16011057 - time (sec): 15.51 - samples/sec: 7404.51 - lr: 0.000024 - momentum: 0.000000
2023-10-18 21:17:50,641 epoch 6 - iter 792/992 - loss 0.15712117 - time (sec): 17.72 - samples/sec: 7412.21 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:17:52,843 epoch 6 - iter 891/992 - loss 0.16041497 - time (sec): 19.92 - samples/sec: 7364.21 - lr: 0.000023 - momentum: 0.000000
2023-10-18 21:17:55,110 epoch 6 - iter 990/992 - loss 0.15981362 - time (sec): 22.19 - samples/sec: 7375.09 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:17:55,159 ----------------------------------------------------------------------------------------------------
2023-10-18 21:17:55,159 EPOCH 6 done: loss 0.1597 - lr: 0.000022
2023-10-18 21:17:57,029 DEV : loss 0.13293296098709106 - f1-score (micro avg) 0.5787
2023-10-18 21:17:57,048 saving best model
2023-10-18 21:17:57,082 ----------------------------------------------------------------------------------------------------
2023-10-18 21:17:59,315 epoch 7 - iter 99/992 - loss 0.13942076 - time (sec): 2.23 - samples/sec: 7690.95 - lr: 0.000022 - momentum: 0.000000
2023-10-18 21:18:01,494 epoch 7 - iter 198/992 - loss 0.14325448 - time (sec): 4.41 - samples/sec: 7803.96 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:18:03,759 epoch 7 - iter 297/992 - loss 0.14913377 - time (sec): 6.68 - samples/sec: 7639.00 - lr: 0.000021 - momentum: 0.000000
2023-10-18 21:18:05,986 epoch 7 - iter 396/992 - loss 0.15476655 - time (sec): 8.90 - samples/sec: 7534.24 - lr: 0.000020 - momentum: 0.000000
2023-10-18 21:18:08,218 epoch 7 - iter 495/992 - loss 0.15347097 - time (sec): 11.14 - samples/sec: 7513.37 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:18:10,453 epoch 7 - iter 594/992 - loss 0.15068835 - time (sec): 13.37 - samples/sec: 7485.75 - lr: 0.000019 - momentum: 0.000000
2023-10-18 21:18:12,712 epoch 7 - iter 693/992 - loss 0.14857101 - time (sec): 15.63 - samples/sec: 7470.10 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:18:14,906 epoch 7 - iter 792/992 - loss 0.14959639 - time (sec): 17.82 - samples/sec: 7457.26 - lr: 0.000018 - momentum: 0.000000
2023-10-18 21:18:17,123 epoch 7 - iter 891/992 - loss 0.14984248 - time (sec): 20.04 - samples/sec: 7383.31 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:18:19,314 epoch 7 - iter 990/992 - loss 0.15087522 - time (sec): 22.23 - samples/sec: 7355.24 - lr: 0.000017 - momentum: 0.000000
2023-10-18 21:18:19,366 ----------------------------------------------------------------------------------------------------
2023-10-18 21:18:19,366 EPOCH 7 done: loss 0.1508 - lr: 0.000017
2023-10-18 21:18:21,594 DEV : loss 0.13094773888587952 - f1-score (micro avg) 0.5961
2023-10-18 21:18:21,612 saving best model
2023-10-18 21:18:21,648 ----------------------------------------------------------------------------------------------------
2023-10-18 21:18:23,910 epoch 8 - iter 99/992 - loss 0.13743854 - time (sec): 2.26 - samples/sec: 7400.23 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:18:26,262 epoch 8 - iter 198/992 - loss 0.14110764 - time (sec): 4.61 - samples/sec: 7204.42 - lr: 0.000016 - momentum: 0.000000
2023-10-18 21:18:28,496 epoch 8 - iter 297/992 - loss 0.14324894 - time (sec): 6.85 - samples/sec: 7150.71 - lr: 0.000015 - momentum: 0.000000
2023-10-18 21:18:30,689 epoch 8 - iter 396/992 - loss 0.14564214 - time (sec): 9.04 - samples/sec: 7201.23 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:18:32,915 epoch 8 - iter 495/992 - loss 0.14616240 - time (sec): 11.27 - samples/sec: 7330.10 - lr: 0.000014 - momentum: 0.000000
2023-10-18 21:18:35,175 epoch 8 - iter 594/992 - loss 0.14778882 - time (sec): 13.53 - samples/sec: 7299.12 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:18:37,371 epoch 8 - iter 693/992 - loss 0.14760928 - time (sec): 15.72 - samples/sec: 7260.68 - lr: 0.000013 - momentum: 0.000000
2023-10-18 21:18:39,585 epoch 8 - iter 792/992 - loss 0.14535005 - time (sec): 17.94 - samples/sec: 7304.22 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:18:41,814 epoch 8 - iter 891/992 - loss 0.14591531 - time (sec): 20.16 - samples/sec: 7302.19 - lr: 0.000012 - momentum: 0.000000
2023-10-18 21:18:44,017 epoch 8 - iter 990/992 - loss 0.14527308 - time (sec): 22.37 - samples/sec: 7313.69 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:18:44,065 ----------------------------------------------------------------------------------------------------
2023-10-18 21:18:44,065 EPOCH 8 done: loss 0.1451 - lr: 0.000011
2023-10-18 21:18:45,896 DEV : loss 0.13168883323669434 - f1-score (micro avg) 0.5982
2023-10-18 21:18:45,915 saving best model
2023-10-18 21:18:45,951 ----------------------------------------------------------------------------------------------------
2023-10-18 21:18:48,175 epoch 9 - iter 99/992 - loss 0.13619358 - time (sec): 2.22 - samples/sec: 7282.70 - lr: 0.000011 - momentum: 0.000000
2023-10-18 21:18:50,442 epoch 9 - iter 198/992 - loss 0.13633010 - time (sec): 4.49 - samples/sec: 7272.94 - lr: 0.000010 - momentum: 0.000000
2023-10-18 21:18:52,662 epoch 9 - iter 297/992 - loss 0.13650790 - time (sec): 6.71 - samples/sec: 7255.07 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:18:54,868 epoch 9 - iter 396/992 - loss 0.13657536 - time (sec): 8.92 - samples/sec: 7228.79 - lr: 0.000009 - momentum: 0.000000
2023-10-18 21:18:57,157 epoch 9 - iter 495/992 - loss 0.13625592 - time (sec): 11.21 - samples/sec: 7357.39 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:18:59,407 epoch 9 - iter 594/992 - loss 0.13653751 - time (sec): 13.46 - samples/sec: 7356.54 - lr: 0.000008 - momentum: 0.000000
2023-10-18 21:19:01,641 epoch 9 - iter 693/992 - loss 0.13850946 - time (sec): 15.69 - samples/sec: 7338.11 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:19:03,933 epoch 9 - iter 792/992 - loss 0.13733639 - time (sec): 17.98 - samples/sec: 7322.44 - lr: 0.000007 - momentum: 0.000000
2023-10-18 21:19:06,261 epoch 9 - iter 891/992 - loss 0.13746495 - time (sec): 20.31 - samples/sec: 7268.68 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:19:08,482 epoch 9 - iter 990/992 - loss 0.13949435 - time (sec): 22.53 - samples/sec: 7265.99 - lr: 0.000006 - momentum: 0.000000
2023-10-18 21:19:08,526 ----------------------------------------------------------------------------------------------------
2023-10-18 21:19:08,527 EPOCH 9 done: loss 0.1393 - lr: 0.000006
2023-10-18 21:19:10,353 DEV : loss 0.13056860864162445 - f1-score (micro avg) 0.6056
2023-10-18 21:19:10,372 saving best model
2023-10-18 21:19:10,406 ----------------------------------------------------------------------------------------------------
2023-10-18 21:19:12,669 epoch 10 - iter 99/992 - loss 0.13940439 - time (sec): 2.26 - samples/sec: 7075.75 - lr: 0.000005 - momentum: 0.000000
2023-10-18 21:19:14,890 epoch 10 - iter 198/992 - loss 0.13866219 - time (sec): 4.48 - samples/sec: 7328.81 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:19:17,089 epoch 10 - iter 297/992 - loss 0.13654053 - time (sec): 6.68 - samples/sec: 7366.53 - lr: 0.000004 - momentum: 0.000000
2023-10-18 21:19:19,280 epoch 10 - iter 396/992 - loss 0.13952155 - time (sec): 8.87 - samples/sec: 7422.76 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:19:21,607 epoch 10 - iter 495/992 - loss 0.13874725 - time (sec): 11.20 - samples/sec: 7386.53 - lr: 0.000003 - momentum: 0.000000
2023-10-18 21:19:23,650 epoch 10 - iter 594/992 - loss 0.14103061 - time (sec): 13.24 - samples/sec: 7453.94 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:19:25,737 epoch 10 - iter 693/992 - loss 0.14107427 - time (sec): 15.33 - samples/sec: 7458.99 - lr: 0.000002 - momentum: 0.000000
2023-10-18 21:19:27,975 epoch 10 - iter 792/992 - loss 0.13930647 - time (sec): 17.57 - samples/sec: 7445.01 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:19:30,292 epoch 10 - iter 891/992 - loss 0.14078958 - time (sec): 19.89 - samples/sec: 7377.90 - lr: 0.000001 - momentum: 0.000000
2023-10-18 21:19:32,519 epoch 10 - iter 990/992 - loss 0.13836888 - time (sec): 22.11 - samples/sec: 7400.86 - lr: 0.000000 - momentum: 0.000000
2023-10-18 21:19:32,565 ----------------------------------------------------------------------------------------------------
2023-10-18 21:19:32,565 EPOCH 10 done: loss 0.1383 - lr: 0.000000
2023-10-18 21:19:34,388 DEV : loss 0.13230597972869873 - f1-score (micro avg) 0.6031
2023-10-18 21:19:34,434 ----------------------------------------------------------------------------------------------------
2023-10-18 21:19:34,435 Loading model from best epoch ...
2023-10-18 21:19:34,517 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-18 21:19:36,025
Results:
- F-score (micro) 0.6226
- F-score (macro) 0.4557
- Accuracy 0.4912
By class:
              precision    recall  f1-score   support

         LOC     0.7189    0.7496    0.7339       655
         PER     0.4162    0.6233    0.4991       223
         ORG     0.2973    0.0866    0.1341       127

   micro avg     0.6082    0.6378    0.6226      1005
   macro avg     0.4775    0.4865    0.4557      1005
weighted avg     0.5984    0.6378    0.6060      1005
2023-10-18 21:19:36,025 ----------------------------------------------------------------------------------------------------
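The aggregate rows of the final report follow directly from the per-class rows: macro F1 is the unweighted mean of the class F1 scores, micro F1 is the harmonic mean of the micro-averaged precision and recall, and weighted F1 is the support-weighted mean. A quick consistency check over the numbers above:

```python
# Recompute the averaged F1 scores from the per-class table above.

per_class_f1 = {"LOC": 0.7339, "PER": 0.4991, "ORG": 0.1341}
support = {"LOC": 655, "PER": 223, "ORG": 127}

# macro F1: unweighted mean of per-class F1
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# micro F1: harmonic mean of the micro-averaged precision and recall
micro_p, micro_r = 0.6082, 0.6378
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# weighted F1: support-weighted mean of per-class F1
weighted_f1 = (
    sum(per_class_f1[c] * support[c] for c in support) / sum(support.values())
)

print(round(macro_f1, 4), round(micro_f1, 4), round(weighted_f1, 4))
```

The low macro F1 relative to micro F1 reflects the weak ORG class (F1 0.1341 on only 127 mentions) dragging down the unweighted average, while the dominant LOC class keeps the micro score higher.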