2023-10-13 08:23:28,457 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:28,458 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 08:23:28,458 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:28,458 MultiCorpus: 1100 train + 206 dev + 240 test sentences
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-13 08:23:28,459 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:28,459 Train: 1100 sentences
2023-10-13 08:23:28,459 (train_with_dev=False, train_with_test=False)
2023-10-13 08:23:28,459 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:28,459 Training Params:
2023-10-13 08:23:28,459 - learning_rate: "5e-05"
2023-10-13 08:23:28,459 - mini_batch_size: "4"
2023-10-13 08:23:28,459 - max_epochs: "10"
2023-10-13 08:23:28,459 - shuffle: "True"
2023-10-13 08:23:28,459 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:28,459 Plugins:
2023-10-13 08:23:28,459 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:23:28,459 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:28,459 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:23:28,459 - metric: "('micro avg', 'f1-score')"
2023-10-13 08:23:28,459 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:28,459 Computation:
2023-10-13 08:23:28,459 - compute on device: cuda:0
2023-10-13 08:23:28,459 - embedding storage: none
2023-10-13 08:23:28,459 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:28,459 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-13 08:23:28,459 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:28,459 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:29,713 epoch 1 - iter 27/275 - loss 3.39902518 - time (sec): 1.25 - samples/sec: 1969.29 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:23:30,927 epoch 1 - iter 54/275 - loss 2.87294952 - time (sec): 2.47 - samples/sec: 1848.98 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:23:32,220 epoch 1 - iter 81/275 - loss 2.22493882 - time (sec): 3.76 - samples/sec: 1786.06 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:23:33,660 epoch 1 - iter 108/275 - loss 1.83215945 - time (sec): 5.20 - samples/sec: 1792.67 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:23:35,031 epoch 1 - iter 135/275 - loss 1.64868900 - time (sec): 6.57 - samples/sec: 1720.13 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:23:36,334 epoch 1 - iter 162/275 - loss 1.49452799 - time (sec): 7.87 - samples/sec: 1699.63 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:23:37,694 epoch 1 - iter 189/275 - loss 1.36894456 - time (sec): 9.23 - samples/sec: 1700.21 - lr: 0.000034 - momentum: 0.000000
2023-10-13 08:23:39,006 epoch 1 - iter 216/275 - loss 1.22942202 - time (sec): 10.55 - samples/sec: 1725.70 - lr: 0.000039 - momentum: 0.000000
2023-10-13 08:23:40,220 epoch 1 - iter 243/275 - loss 1.13670758 - time (sec): 11.76 - samples/sec: 1724.96 - lr: 0.000044 - momentum: 0.000000
2023-10-13 08:23:41,359 epoch 1 - iter 270/275 - loss 1.06530735 - time (sec): 12.90 - samples/sec: 1733.69 - lr: 0.000049 - momentum: 0.000000
2023-10-13 08:23:41,573 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:41,573 EPOCH 1 done: loss 1.0497 - lr: 0.000049
2023-10-13 08:23:42,353 DEV : loss 0.23358413577079773 - f1-score (micro avg) 0.712
2023-10-13 08:23:42,359 saving best model
2023-10-13 08:23:42,664 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:43,803 epoch 2 - iter 27/275 - loss 0.22629551 - time (sec): 1.14 - samples/sec: 2200.36 - lr: 0.000049 - momentum: 0.000000
2023-10-13 08:23:44,988 epoch 2 - iter 54/275 - loss 0.22270726 - time (sec): 2.32 - samples/sec: 1883.93 - lr: 0.000049 - momentum: 0.000000
2023-10-13 08:23:46,161 epoch 2 - iter 81/275 - loss 0.20667117 - time (sec): 3.50 - samples/sec: 1907.77 - lr: 0.000048 - momentum: 0.000000
2023-10-13 08:23:47,326 epoch 2 - iter 108/275 - loss 0.20192750 - time (sec): 4.66 - samples/sec: 1895.84 - lr: 0.000048 - momentum: 0.000000
2023-10-13 08:23:48,510 epoch 2 - iter 135/275 - loss 0.18672772 - time (sec): 5.85 - samples/sec: 1907.29 - lr: 0.000047 - momentum: 0.000000
2023-10-13 08:23:49,706 epoch 2 - iter 162/275 - loss 0.18416707 - time (sec): 7.04 - samples/sec: 1891.22 - lr: 0.000047 - momentum: 0.000000
2023-10-13 08:23:50,882 epoch 2 - iter 189/275 - loss 0.17608829 - time (sec): 8.22 - samples/sec: 1881.81 - lr: 0.000046 - momentum: 0.000000
2023-10-13 08:23:52,157 epoch 2 - iter 216/275 - loss 0.18436043 - time (sec): 9.49 - samples/sec: 1882.30 - lr: 0.000046 - momentum: 0.000000
2023-10-13 08:23:53,374 epoch 2 - iter 243/275 - loss 0.18599630 - time (sec): 10.71 - samples/sec: 1873.54 - lr: 0.000045 - momentum: 0.000000
2023-10-13 08:23:54,579 epoch 2 - iter 270/275 - loss 0.18136454 - time (sec): 11.91 - samples/sec: 1878.19 - lr: 0.000045 - momentum: 0.000000
2023-10-13 08:23:54,796 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:54,796 EPOCH 2 done: loss 0.1794 - lr: 0.000045
2023-10-13 08:23:55,430 DEV : loss 0.1404486447572708 - f1-score (micro avg) 0.8351
2023-10-13 08:23:55,435 saving best model
2023-10-13 08:23:55,832 ----------------------------------------------------------------------------------------------------
2023-10-13 08:23:57,062 epoch 3 - iter 27/275 - loss 0.13376844 - time (sec): 1.23 - samples/sec: 1664.30 - lr: 0.000044 - momentum: 0.000000
2023-10-13 08:23:58,325 epoch 3 - iter 54/275 - loss 0.12447582 - time (sec): 2.49 - samples/sec: 1721.00 - lr: 0.000043 - momentum: 0.000000
2023-10-13 08:23:59,513 epoch 3 - iter 81/275 - loss 0.12212165 - time (sec): 3.68 - samples/sec: 1816.92 - lr: 0.000043 - momentum: 0.000000
2023-10-13 08:24:00,638 epoch 3 - iter 108/275 - loss 0.12004262 - time (sec): 4.80 - samples/sec: 1865.63 - lr: 0.000042 - momentum: 0.000000
2023-10-13 08:24:01,789 epoch 3 - iter 135/275 - loss 0.11280775 - time (sec): 5.95 - samples/sec: 1881.82 - lr: 0.000042 - momentum: 0.000000
2023-10-13 08:24:02,919 epoch 3 - iter 162/275 - loss 0.11290164 - time (sec): 7.08 - samples/sec: 1872.88 - lr: 0.000041 - momentum: 0.000000
2023-10-13 08:24:04,064 epoch 3 - iter 189/275 - loss 0.10849768 - time (sec): 8.23 - samples/sec: 1893.46 - lr: 0.000041 - momentum: 0.000000
2023-10-13 08:24:05,201 epoch 3 - iter 216/275 - loss 0.10865497 - time (sec): 9.37 - samples/sec: 1876.12 - lr: 0.000040 - momentum: 0.000000
2023-10-13 08:24:06,339 epoch 3 - iter 243/275 - loss 0.11196281 - time (sec): 10.50 - samples/sec: 1901.51 - lr: 0.000040 - momentum: 0.000000
2023-10-13 08:24:07,481 epoch 3 - iter 270/275 - loss 0.11441067 - time (sec): 11.65 - samples/sec: 1920.62 - lr: 0.000039 - momentum: 0.000000
2023-10-13 08:24:07,696 ----------------------------------------------------------------------------------------------------
2023-10-13 08:24:07,696 EPOCH 3 done: loss 0.1137 - lr: 0.000039
2023-10-13 08:24:08,424 DEV : loss 0.15840192139148712 - f1-score (micro avg) 0.8302
2023-10-13 08:24:08,429 ----------------------------------------------------------------------------------------------------
2023-10-13 08:24:09,612 epoch 4 - iter 27/275 - loss 0.04289286 - time (sec): 1.18 - samples/sec: 1851.60 - lr: 0.000038 - momentum: 0.000000
2023-10-13 08:24:10,803 epoch 4 - iter 54/275 - loss 0.06359151 - time (sec): 2.37 - samples/sec: 1929.20 - lr: 0.000038 - momentum: 0.000000
2023-10-13 08:24:12,019 epoch 4 - iter 81/275 - loss 0.07865884 - time (sec): 3.59 - samples/sec: 1878.36 - lr: 0.000037 - momentum: 0.000000
2023-10-13 08:24:13,185 epoch 4 - iter 108/275 - loss 0.07465091 - time (sec): 4.75 - samples/sec: 1881.82 - lr: 0.000037 - momentum: 0.000000
2023-10-13 08:24:14,321 epoch 4 - iter 135/275 - loss 0.07631322 - time (sec): 5.89 - samples/sec: 1899.14 - lr: 0.000036 - momentum: 0.000000
2023-10-13 08:24:15,447 epoch 4 - iter 162/275 - loss 0.07773191 - time (sec): 7.02 - samples/sec: 1902.07 - lr: 0.000036 - momentum: 0.000000
2023-10-13 08:24:16,633 epoch 4 - iter 189/275 - loss 0.08141307 - time (sec): 8.20 - samples/sec: 1881.63 - lr: 0.000035 - momentum: 0.000000
2023-10-13 08:24:17,925 epoch 4 - iter 216/275 - loss 0.08450300 - time (sec): 9.50 - samples/sec: 1845.11 - lr: 0.000035 - momentum: 0.000000
2023-10-13 08:24:19,163 epoch 4 - iter 243/275 - loss 0.08201415 - time (sec): 10.73 - samples/sec: 1882.05 - lr: 0.000034 - momentum: 0.000000
2023-10-13 08:24:20,306 epoch 4 - iter 270/275 - loss 0.08401559 - time (sec): 11.88 - samples/sec: 1886.45 - lr: 0.000034 - momentum: 0.000000
2023-10-13 08:24:20,525 ----------------------------------------------------------------------------------------------------
2023-10-13 08:24:20,525 EPOCH 4 done: loss 0.0837 - lr: 0.000034
2023-10-13 08:24:21,177 DEV : loss 0.14714744687080383 - f1-score (micro avg) 0.8547
2023-10-13 08:24:21,182 saving best model
2023-10-13 08:24:21,610 ----------------------------------------------------------------------------------------------------
2023-10-13 08:24:22,819 epoch 5 - iter 27/275 - loss 0.05779460 - time (sec): 1.21 - samples/sec: 1938.10 - lr: 0.000033 - momentum: 0.000000
2023-10-13 08:24:24,023 epoch 5 - iter 54/275 - loss 0.05731164 - time (sec): 2.41 - samples/sec: 1900.43 - lr: 0.000032 - momentum: 0.000000
2023-10-13 08:24:25,225 epoch 5 - iter 81/275 - loss 0.08359211 - time (sec): 3.61 - samples/sec: 1842.00 - lr: 0.000032 - momentum: 0.000000
2023-10-13 08:24:26,431 epoch 5 - iter 108/275 - loss 0.07736091 - time (sec): 4.82 - samples/sec: 1829.20 - lr: 0.000031 - momentum: 0.000000
2023-10-13 08:24:27,694 epoch 5 - iter 135/275 - loss 0.07574006 - time (sec): 6.08 - samples/sec: 1834.90 - lr: 0.000031 - momentum: 0.000000
2023-10-13 08:24:29,106 epoch 5 - iter 162/275 - loss 0.06984948 - time (sec): 7.49 - samples/sec: 1784.72 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:24:30,362 epoch 5 - iter 189/275 - loss 0.06945974 - time (sec): 8.75 - samples/sec: 1786.23 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:24:31,586 epoch 5 - iter 216/275 - loss 0.06498528 - time (sec): 9.97 - samples/sec: 1764.83 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:24:32,793 epoch 5 - iter 243/275 - loss 0.06206434 - time (sec): 11.18 - samples/sec: 1778.79 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:24:34,068 epoch 5 - iter 270/275 - loss 0.05930514 - time (sec): 12.46 - samples/sec: 1784.40 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:24:34,302 ----------------------------------------------------------------------------------------------------
2023-10-13 08:24:34,303 EPOCH 5 done: loss 0.0589 - lr: 0.000028
2023-10-13 08:24:34,992 DEV : loss 0.1557360142469406 - f1-score (micro avg) 0.8764
2023-10-13 08:24:34,997 saving best model
2023-10-13 08:24:35,442 ----------------------------------------------------------------------------------------------------
2023-10-13 08:24:36,671 epoch 6 - iter 27/275 - loss 0.04213156 - time (sec): 1.23 - samples/sec: 1852.26 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:24:37,953 epoch 6 - iter 54/275 - loss 0.03746027 - time (sec): 2.51 - samples/sec: 1816.43 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:24:39,206 epoch 6 - iter 81/275 - loss 0.04423429 - time (sec): 3.76 - samples/sec: 1804.54 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:24:40,446 epoch 6 - iter 108/275 - loss 0.04302431 - time (sec): 5.00 - samples/sec: 1789.35 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:24:41,727 epoch 6 - iter 135/275 - loss 0.05133329 - time (sec): 6.28 - samples/sec: 1757.54 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:24:42,990 epoch 6 - iter 162/275 - loss 0.05113197 - time (sec): 7.55 - samples/sec: 1765.51 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:24:44,219 epoch 6 - iter 189/275 - loss 0.04766058 - time (sec): 8.78 - samples/sec: 1768.40 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:24:45,401 epoch 6 - iter 216/275 - loss 0.04578779 - time (sec): 9.96 - samples/sec: 1786.38 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:24:46,595 epoch 6 - iter 243/275 - loss 0.04590030 - time (sec): 11.15 - samples/sec: 1802.26 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:24:47,779 epoch 6 - iter 270/275 - loss 0.04431936 - time (sec): 12.34 - samples/sec: 1811.08 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:24:48,013 ----------------------------------------------------------------------------------------------------
2023-10-13 08:24:48,014 EPOCH 6 done: loss 0.0439 - lr: 0.000022
2023-10-13 08:24:48,683 DEV : loss 0.1554672122001648 - f1-score (micro avg) 0.8743
2023-10-13 08:24:48,688 ----------------------------------------------------------------------------------------------------
2023-10-13 08:24:49,910 epoch 7 - iter 27/275 - loss 0.00825223 - time (sec): 1.22 - samples/sec: 1803.21 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:24:51,120 epoch 7 - iter 54/275 - loss 0.01703077 - time (sec): 2.43 - samples/sec: 1833.99 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:24:52,305 epoch 7 - iter 81/275 - loss 0.01503006 - time (sec): 3.62 - samples/sec: 1784.71 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:24:53,481 epoch 7 - iter 108/275 - loss 0.01992351 - time (sec): 4.79 - samples/sec: 1859.00 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:24:54,668 epoch 7 - iter 135/275 - loss 0.02086434 - time (sec): 5.98 - samples/sec: 1889.01 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:24:55,847 epoch 7 - iter 162/275 - loss 0.02398405 - time (sec): 7.16 - samples/sec: 1861.64 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:24:57,072 epoch 7 - iter 189/275 - loss 0.02918174 - time (sec): 8.38 - samples/sec: 1867.97 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:24:58,265 epoch 7 - iter 216/275 - loss 0.03306324 - time (sec): 9.58 - samples/sec: 1879.74 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:24:59,466 epoch 7 - iter 243/275 - loss 0.03076299 - time (sec): 10.78 - samples/sec: 1864.56 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:25:00,706 epoch 7 - iter 270/275 - loss 0.02969565 - time (sec): 12.02 - samples/sec: 1859.25 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:25:00,936 ----------------------------------------------------------------------------------------------------
2023-10-13 08:25:00,936 EPOCH 7 done: loss 0.0296 - lr: 0.000017
2023-10-13 08:25:01,676 DEV : loss 0.16044297814369202 - f1-score (micro avg) 0.8674
2023-10-13 08:25:01,681 ----------------------------------------------------------------------------------------------------
2023-10-13 08:25:02,895 epoch 8 - iter 27/275 - loss 0.01298090 - time (sec): 1.21 - samples/sec: 1819.31 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:25:04,040 epoch 8 - iter 54/275 - loss 0.01025808 - time (sec): 2.36 - samples/sec: 1864.63 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:25:05,158 epoch 8 - iter 81/275 - loss 0.02586707 - time (sec): 3.48 - samples/sec: 1922.68 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:25:06,302 epoch 8 - iter 108/275 - loss 0.02302143 - time (sec): 4.62 - samples/sec: 1893.34 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:25:07,452 epoch 8 - iter 135/275 - loss 0.02504169 - time (sec): 5.77 - samples/sec: 1924.32 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:25:08,584 epoch 8 - iter 162/275 - loss 0.02809093 - time (sec): 6.90 - samples/sec: 1929.19 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:25:09,738 epoch 8 - iter 189/275 - loss 0.02567335 - time (sec): 8.06 - samples/sec: 1969.05 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:25:10,876 epoch 8 - iter 216/275 - loss 0.02470403 - time (sec): 9.19 - samples/sec: 1955.90 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:25:12,026 epoch 8 - iter 243/275 - loss 0.02352710 - time (sec): 10.34 - samples/sec: 1958.69 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:25:13,220 epoch 8 - iter 270/275 - loss 0.02178292 - time (sec): 11.54 - samples/sec: 1936.47 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:25:13,440 ----------------------------------------------------------------------------------------------------
2023-10-13 08:25:13,441 EPOCH 8 done: loss 0.0223 - lr: 0.000011
2023-10-13 08:25:14,105 DEV : loss 0.1707383543252945 - f1-score (micro avg) 0.8756
2023-10-13 08:25:14,110 ----------------------------------------------------------------------------------------------------
2023-10-13 08:25:15,267 epoch 9 - iter 27/275 - loss 0.00650434 - time (sec): 1.16 - samples/sec: 1932.08 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:25:16,425 epoch 9 - iter 54/275 - loss 0.01111044 - time (sec): 2.31 - samples/sec: 1917.46 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:25:17,602 epoch 9 - iter 81/275 - loss 0.01080575 - time (sec): 3.49 - samples/sec: 1910.45 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:25:18,771 epoch 9 - iter 108/275 - loss 0.01282098 - time (sec): 4.66 - samples/sec: 1925.84 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:25:19,914 epoch 9 - iter 135/275 - loss 0.01581696 - time (sec): 5.80 - samples/sec: 1961.44 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:25:21,172 epoch 9 - iter 162/275 - loss 0.01560980 - time (sec): 7.06 - samples/sec: 1918.53 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:25:22,356 epoch 9 - iter 189/275 - loss 0.01591003 - time (sec): 8.24 - samples/sec: 1898.82 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:25:23,524 epoch 9 - iter 216/275 - loss 0.01508954 - time (sec): 9.41 - samples/sec: 1920.66 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:25:24,711 epoch 9 - iter 243/275 - loss 0.01646518 - time (sec): 10.60 - samples/sec: 1908.94 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:25:25,918 epoch 9 - iter 270/275 - loss 0.01574494 - time (sec): 11.81 - samples/sec: 1893.24 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:25:26,130 ----------------------------------------------------------------------------------------------------
2023-10-13 08:25:26,131 EPOCH 9 done: loss 0.0155 - lr: 0.000006
2023-10-13 08:25:26,778 DEV : loss 0.16440145671367645 - f1-score (micro avg) 0.8822
2023-10-13 08:25:26,783 saving best model
2023-10-13 08:25:27,228 ----------------------------------------------------------------------------------------------------
2023-10-13 08:25:28,414 epoch 10 - iter 27/275 - loss 0.01136074 - time (sec): 1.18 - samples/sec: 2111.51 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:25:29,580 epoch 10 - iter 54/275 - loss 0.00715528 - time (sec): 2.35 - samples/sec: 1898.66 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:25:30,739 epoch 10 - iter 81/275 - loss 0.01069227 - time (sec): 3.51 - samples/sec: 1922.98 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:25:31,958 epoch 10 - iter 108/275 - loss 0.00857349 - time (sec): 4.73 - samples/sec: 1884.88 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:25:33,136 epoch 10 - iter 135/275 - loss 0.00833031 - time (sec): 5.91 - samples/sec: 1841.81 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:25:34,329 epoch 10 - iter 162/275 - loss 0.00906827 - time (sec): 7.10 - samples/sec: 1870.17 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:25:35,521 epoch 10 - iter 189/275 - loss 0.00820492 - time (sec): 8.29 - samples/sec: 1910.26 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:25:36,693 epoch 10 - iter 216/275 - loss 0.00800912 - time (sec): 9.46 - samples/sec: 1893.25 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:25:37,866 epoch 10 - iter 243/275 - loss 0.01035358 - time (sec): 10.64 - samples/sec: 1892.33 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:25:39,051 epoch 10 - iter 270/275 - loss 0.01116386 - time (sec): 11.82 - samples/sec: 1882.19 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:25:39,268 ----------------------------------------------------------------------------------------------------
2023-10-13 08:25:39,268 EPOCH 10 done: loss 0.0111 - lr: 0.000000
2023-10-13 08:25:39,961 DEV : loss 0.16304759681224823 - f1-score (micro avg) 0.8849
2023-10-13 08:25:39,966 saving best model
2023-10-13 08:25:40,728 ----------------------------------------------------------------------------------------------------
2023-10-13 08:25:40,729 Loading model from best epoch ...
2023-10-13 08:25:42,242 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 08:25:42,961
Results:
- F-score (micro) 0.9041
- F-score (macro) 0.8083
- Accuracy 0.8411
By class:
              precision    recall  f1-score   support

       scope     0.8876    0.8977    0.8927       176
        pers     0.9524    0.9375    0.9449       128
        work     0.8767    0.8649    0.8707        74
      object     1.0000    0.5000    0.6667         2
         loc     1.0000    0.5000    0.6667         2

   micro avg     0.9077    0.9005    0.9041       382
   macro avg     0.9433    0.7400    0.8083       382
weighted avg     0.9084    0.9005    0.9035       382
2023-10-13 08:25:42,961 ----------------------------------------------------------------------------------------------------
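
Note on the summary scores: the macro and micro F-scores reported under "Results" can be cross-checked against the per-class rows. The following is a minimal sketch, not part of the training run. The numbers are copied from the "By class" report, and the names `report`, `macro_f1`, and `micro_f1` are hypothetical helpers introduced only for this illustration. It assumes the usual definitions: macro F1 is the unweighted mean of the per-class F1 values, and micro F1 is the harmonic mean of the micro-averaged precision and recall.

```python
# Per-class rows copied from the "By class" report:
# label -> (precision, recall, f1-score, support)
report = {
    "scope":  (0.8876, 0.8977, 0.8927, 176),
    "pers":   (0.9524, 0.9375, 0.9449, 128),
    "work":   (0.8767, 0.8649, 0.8707, 74),
    "object": (1.0000, 0.5000, 0.6667, 2),
    "loc":    (1.0000, 0.5000, 0.6667, 2),
}

def macro_f1(rows):
    """Unweighted mean of the per-class F1 values."""
    return sum(f1 for _, _, f1, _ in rows.values()) / len(rows)

def micro_f1(precision, recall):
    """Harmonic mean of micro-averaged precision and recall."""
    return 2 * precision * recall / (precision + recall)

macro = macro_f1(report)
# Micro-averaged precision and recall taken from the "micro avg" row.
micro = micro_f1(0.9077, 0.9005)
total_support = sum(support for _, _, _, support in report.values())

print(f"macro F1: {macro:.4f}")
print(f"micro F1: {micro:.4f}")
print(f"support:  {total_support}")
```

Because the inputs are already rounded to four decimals, the recomputed values can only be expected to agree with the log to about that precision. The two classes with a support of 2 (object, loc) pull the macro average well below the micro average, which is why the two summary scores differ by almost ten points.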