2023-10-16 09:57:38,090 ----------------------------------------------------------------------------------------------------
2023-10-16 09:57:38,091 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 09:57:38,091 ----------------------------------------------------------------------------------------------------
2023-10-16 09:57:38,092 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-16 09:57:38,092 ----------------------------------------------------------------------------------------------------
2023-10-16 09:57:38,092 Train:  7142 sentences
2023-10-16 09:57:38,092         (train_with_dev=False, train_with_test=False)
2023-10-16 09:57:38,092 ----------------------------------------------------------------------------------------------------
2023-10-16 09:57:38,092 Training Params:
2023-10-16 09:57:38,092  - learning_rate: "5e-05"
2023-10-16 09:57:38,092  - mini_batch_size: "4"
2023-10-16 09:57:38,092  - max_epochs: "10"
2023-10-16 09:57:38,092  - shuffle: "True"
2023-10-16 09:57:38,092 ----------------------------------------------------------------------------------------------------
2023-10-16 09:57:38,092 Plugins:
2023-10-16 09:57:38,092  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 09:57:38,092 ----------------------------------------------------------------------------------------------------
2023-10-16 09:57:38,092 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 09:57:38,092  - metric: "('micro avg', 'f1-score')"
2023-10-16 09:57:38,092 ----------------------------------------------------------------------------------------------------
2023-10-16 09:57:38,092 Computation:
2023-10-16 09:57:38,092  - compute on device: cuda:0
2023-10-16 09:57:38,092  - embedding storage: none
2023-10-16 09:57:38,092 ----------------------------------------------------------------------------------------------------
2023-10-16 09:57:38,092 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-16 09:57:38,093 ----------------------------------------------------------------------------------------------------
2023-10-16 09:57:38,093 ----------------------------------------------------------------------------------------------------
2023-10-16 09:57:46,981 epoch 1 - iter 178/1786 - loss 1.94742290 - time (sec): 8.89 - samples/sec: 2937.79 - lr: 0.000005 - momentum: 0.000000
2023-10-16 09:57:55,593 epoch 1 - iter 356/1786 - loss 1.24408736 - time (sec): 17.50 - samples/sec: 2890.19 - lr: 0.000010 - momentum: 0.000000
2023-10-16 09:58:04,150 epoch 1 - iter 534/1786 - loss 0.94076449 - time (sec): 26.06 - samples/sec: 2887.12 - lr: 0.000015 - momentum: 0.000000
2023-10-16 09:58:12,740 epoch 1 - iter 712/1786 - loss 0.77543030 - time (sec): 34.65 - samples/sec: 2872.30 - lr: 0.000020 - momentum: 0.000000
2023-10-16 09:58:21,321 epoch 1 - iter 890/1786 - loss 0.66770568 - time (sec): 43.23 - samples/sec: 2843.87 - lr: 0.000025 - momentum: 0.000000
2023-10-16 09:58:30,044 epoch 1 - iter 1068/1786 - loss 0.59121842 - time (sec): 51.95 - samples/sec: 2822.92 - lr: 0.000030 - momentum: 0.000000
2023-10-16 09:58:39,285 epoch 1 - iter 1246/1786 - loss 0.52805819 - time (sec): 61.19 - samples/sec: 2824.23 - lr: 0.000035 - momentum: 0.000000
2023-10-16 09:58:48,299 epoch 1 - iter 1424/1786 - loss 0.47799529 - time (sec): 70.21 - samples/sec: 2826.86 - lr: 0.000040 - momentum: 0.000000
2023-10-16 09:58:57,179 epoch 1 - iter 1602/1786 - loss 0.44205077 - time (sec): 79.09 - samples/sec: 2820.65 - lr: 0.000045 - momentum: 0.000000
2023-10-16 09:59:06,126 epoch 1 - iter 1780/1786 - loss 0.41271469 - time (sec): 88.03 - samples/sec: 2816.87 - lr: 0.000050 - momentum: 0.000000
2023-10-16 09:59:06,417 ----------------------------------------------------------------------------------------------------
2023-10-16 09:59:06,417 EPOCH 1 done: loss 0.4118 - lr: 0.000050
2023-10-16 09:59:09,619 DEV : loss 0.13136854767799377 - f1-score (micro avg) 0.6814
2023-10-16 09:59:09,638 saving best model
2023-10-16 09:59:10,047 ----------------------------------------------------------------------------------------------------
2023-10-16 09:59:19,266 epoch 2 - iter 178/1786 - loss 0.10867454 - time (sec): 9.22 - samples/sec: 2781.86 - lr: 0.000049 - momentum: 0.000000
2023-10-16 09:59:28,367 epoch 2 - iter 356/1786 - loss 0.11195066 - time (sec): 18.32 - samples/sec: 2755.44 - lr: 0.000049 - momentum: 0.000000
2023-10-16 09:59:37,194 epoch 2 - iter 534/1786 - loss 0.12303775 - time (sec): 27.14 - samples/sec: 2719.96 - lr: 0.000048 - momentum: 0.000000
2023-10-16 09:59:45,992 epoch 2 - iter 712/1786 - loss 0.12365229 - time (sec): 35.94 - samples/sec: 2757.21 - lr: 0.000048 - momentum: 0.000000
2023-10-16 09:59:54,863 epoch 2 - iter 890/1786 - loss 0.12778904 - time (sec): 44.81 - samples/sec: 2792.71 - lr: 0.000047 - momentum: 0.000000
2023-10-16 10:00:03,600 epoch 2 - iter 1068/1786 - loss 0.12649898 - time (sec): 53.55 - samples/sec: 2790.61 - lr: 0.000047 - momentum: 0.000000
2023-10-16 10:00:12,465 epoch 2 - iter 1246/1786 - loss 0.12771009 - time (sec): 62.42 - samples/sec: 2777.44 - lr: 0.000046 - momentum: 0.000000
2023-10-16 10:00:21,267 epoch 2 - iter 1424/1786 - loss 0.12802324 - time (sec): 71.22 - samples/sec: 2780.94 - lr: 0.000046 - momentum: 0.000000
2023-10-16 10:00:30,154 epoch 2 - iter 1602/1786 - loss 0.12740373 - time (sec): 80.10 - samples/sec: 2800.22 - lr: 0.000045 - momentum: 0.000000
2023-10-16 10:00:38,805 epoch 2 - iter 1780/1786 - loss 0.12594981 - time (sec): 88.76 - samples/sec: 2796.84 - lr: 0.000044 - momentum: 0.000000
2023-10-16 10:00:39,067 ----------------------------------------------------------------------------------------------------
2023-10-16 10:00:39,067 EPOCH 2 done: loss 0.1258 - lr: 0.000044
2023-10-16 10:00:43,284 DEV : loss 0.13792423903942108 - f1-score (micro avg) 0.7607
2023-10-16 10:00:43,300 saving best model
2023-10-16 10:00:44,350 ----------------------------------------------------------------------------------------------------
2023-10-16 10:00:53,160 epoch 3 - iter 178/1786 - loss 0.08110664 - time (sec): 8.81 - samples/sec: 2699.52 - lr: 0.000044 - momentum: 0.000000
2023-10-16 10:01:01,652 epoch 3 - iter 356/1786 - loss 0.08864351 - time (sec): 17.30 - samples/sec: 2858.43 - lr: 0.000043 - momentum: 0.000000
2023-10-16 10:01:10,237 epoch 3 - iter 534/1786 - loss 0.09228133 - time (sec): 25.89 - samples/sec: 2872.89 - lr: 0.000043 - momentum: 0.000000
2023-10-16 10:01:19,143 epoch 3 - iter 712/1786 - loss 0.09148226 - time (sec): 34.79 - samples/sec: 2895.87 - lr: 0.000042 - momentum: 0.000000
2023-10-16 10:01:28,075 epoch 3 - iter 890/1786 - loss 0.09059160 - time (sec): 43.72 - samples/sec: 2862.22 - lr: 0.000042 - momentum: 0.000000
2023-10-16 10:01:36,653 epoch 3 - iter 1068/1786 - loss 0.09136882 - time (sec): 52.30 - samples/sec: 2844.87 - lr: 0.000041 - momentum: 0.000000
2023-10-16 10:01:45,467 epoch 3 - iter 1246/1786 - loss 0.09087475 - time (sec): 61.12 - samples/sec: 2825.63 - lr: 0.000041 - momentum: 0.000000
2023-10-16 10:01:54,159 epoch 3 - iter 1424/1786 - loss 0.09230921 - time (sec): 69.81 - samples/sec: 2828.78 - lr: 0.000040 - momentum: 0.000000
2023-10-16 10:02:02,750 epoch 3 - iter 1602/1786 - loss 0.09190419 - time (sec): 78.40 - samples/sec: 2829.31 - lr: 0.000039 - momentum: 0.000000
2023-10-16 10:02:11,661 epoch 3 - iter 1780/1786 - loss 0.09100255 - time (sec): 87.31 - samples/sec: 2838.16 - lr: 0.000039 - momentum: 0.000000
2023-10-16 10:02:12,015 ----------------------------------------------------------------------------------------------------
2023-10-16 10:02:12,016 EPOCH 3 done: loss 0.0909 - lr: 0.000039
2023-10-16 10:02:16,212 DEV : loss 0.15914735198020935 - f1-score (micro avg) 0.7774
2023-10-16 10:02:16,228 saving best model
2023-10-16 10:02:16,730 ----------------------------------------------------------------------------------------------------
2023-10-16 10:02:25,530 epoch 4 - iter 178/1786 - loss 0.05552608 - time (sec): 8.80 - samples/sec: 2812.79 - lr: 0.000038 - momentum: 0.000000
2023-10-16 10:02:34,417 epoch 4 - iter 356/1786 - loss 0.06436274 - time (sec): 17.69 - samples/sec: 2799.71 - lr: 0.000038 - momentum: 0.000000
2023-10-16 10:02:43,378 epoch 4 - iter 534/1786 - loss 0.06377877 - time (sec): 26.65 - samples/sec: 2766.31 - lr: 0.000037 - momentum: 0.000000
2023-10-16 10:02:52,524 epoch 4 - iter 712/1786 - loss 0.06390155 - time (sec): 35.79 - samples/sec: 2794.47 - lr: 0.000037 - momentum: 0.000000
2023-10-16 10:03:01,362 epoch 4 - iter 890/1786 - loss 0.06628737 - time (sec): 44.63 - samples/sec: 2774.34 - lr: 0.000036 - momentum: 0.000000
2023-10-16 10:03:10,171 epoch 4 - iter 1068/1786 - loss 0.06610693 - time (sec): 53.44 - samples/sec: 2767.56 - lr: 0.000036 - momentum: 0.000000
2023-10-16 10:03:19,114 epoch 4 - iter 1246/1786 - loss 0.06561388 - time (sec): 62.38 - samples/sec: 2769.21 - lr: 0.000035 - momentum: 0.000000
2023-10-16 10:03:27,722 epoch 4 - iter 1424/1786 - loss 0.06713646 - time (sec): 70.99 - samples/sec: 2773.70 - lr: 0.000034 - momentum: 0.000000
2023-10-16 10:03:36,190 epoch 4 - iter 1602/1786 - loss 0.06817480 - time (sec): 79.46 - samples/sec: 2771.49 - lr: 0.000034 - momentum: 0.000000
2023-10-16 10:03:45,416 epoch 4 - iter 1780/1786 - loss 0.06843290 - time (sec): 88.69 - samples/sec: 2796.41 - lr: 0.000033 - momentum: 0.000000
2023-10-16 10:03:45,716 ----------------------------------------------------------------------------------------------------
2023-10-16 10:03:45,716 EPOCH 4 done: loss 0.0686 - lr: 0.000033
2023-10-16 10:03:50,390 DEV : loss 0.15915921330451965 - f1-score (micro avg) 0.7766
2023-10-16 10:03:50,406 ----------------------------------------------------------------------------------------------------
2023-10-16 10:03:59,089 epoch 5 - iter 178/1786 - loss 0.05022154 - time (sec): 8.68 - samples/sec: 2613.18 - lr: 0.000033 - momentum: 0.000000
2023-10-16 10:04:07,970 epoch 5 - iter 356/1786 - loss 0.05090393 - time (sec): 17.56 - samples/sec: 2741.71 - lr: 0.000032 - momentum: 0.000000
2023-10-16 10:04:16,988 epoch 5 - iter 534/1786 - loss 0.05106929 - time (sec): 26.58 - samples/sec: 2801.87 - lr: 0.000032 - momentum: 0.000000
2023-10-16 10:04:25,594 epoch 5 - iter 712/1786 - loss 0.04974481 - time (sec): 35.19 - samples/sec: 2783.34 - lr: 0.000031 - momentum: 0.000000
2023-10-16 10:04:34,524 epoch 5 - iter 890/1786 - loss 0.05195749 - time (sec): 44.12 - samples/sec: 2808.00 - lr: 0.000031 - momentum: 0.000000
2023-10-16 10:04:43,144 epoch 5 - iter 1068/1786 - loss 0.05146219 - time (sec): 52.74 - samples/sec: 2780.35 - lr: 0.000030 - momentum: 0.000000
2023-10-16 10:04:52,122 epoch 5 - iter 1246/1786 - loss 0.05143701 - time (sec): 61.72 - samples/sec: 2790.81 - lr: 0.000029 - momentum: 0.000000
2023-10-16 10:05:01,020 epoch 5 - iter 1424/1786 - loss 0.05105708 - time (sec): 70.61 - samples/sec: 2786.63 - lr: 0.000029 - momentum: 0.000000
2023-10-16 10:05:09,953 epoch 5 - iter 1602/1786 - loss 0.05263520 - time (sec): 79.55 - samples/sec: 2805.39 - lr: 0.000028 - momentum: 0.000000
2023-10-16 10:05:18,626 epoch 5 - iter 1780/1786 - loss 0.05290404 - time (sec): 88.22 - samples/sec: 2811.00 - lr: 0.000028 - momentum: 0.000000
2023-10-16 10:05:18,934 ----------------------------------------------------------------------------------------------------
2023-10-16 10:05:18,935 EPOCH 5 done: loss 0.0529 - lr: 0.000028
2023-10-16 10:05:23,061 DEV : loss 0.20985299348831177 - f1-score (micro avg) 0.7812
2023-10-16 10:05:23,077 saving best model
2023-10-16 10:05:23,560 ----------------------------------------------------------------------------------------------------
2023-10-16 10:05:32,536 epoch 6 - iter 178/1786 - loss 0.04741637 - time (sec): 8.97 - samples/sec: 2965.00 - lr: 0.000027 - momentum: 0.000000
2023-10-16 10:05:41,275 epoch 6 - iter 356/1786 - loss 0.04275169 - time (sec): 17.71 - samples/sec: 2872.22 - lr: 0.000027 - momentum: 0.000000
2023-10-16 10:05:50,210 epoch 6 - iter 534/1786 - loss 0.03942651 - time (sec): 26.65 - samples/sec: 2818.41 - lr: 0.000026 - momentum: 0.000000
2023-10-16 10:05:58,662 epoch 6 - iter 712/1786 - loss 0.04059826 - time (sec): 35.10 - samples/sec: 2830.73 - lr: 0.000026 - momentum: 0.000000
2023-10-16 10:06:07,484 epoch 6 - iter 890/1786 - loss 0.04004597 - time (sec): 43.92 - samples/sec: 2814.49 - lr: 0.000025 - momentum: 0.000000
2023-10-16 10:06:16,218 epoch 6 - iter 1068/1786 - loss 0.03891575 - time (sec): 52.66 - samples/sec: 2844.00 - lr: 0.000024 - momentum: 0.000000
2023-10-16 10:06:24,675 epoch 6 - iter 1246/1786 - loss 0.03960494 - time (sec): 61.11 - samples/sec: 2830.47 - lr: 0.000024 - momentum: 0.000000
2023-10-16 10:06:33,645 epoch 6 - iter 1424/1786 - loss 0.03949478 - time (sec): 70.08 - samples/sec: 2816.99 - lr: 0.000023 - momentum: 0.000000
2023-10-16 10:06:43,058 epoch 6 - iter 1602/1786 - loss 0.03917810 - time (sec): 79.50 - samples/sec: 2797.23 - lr: 0.000023 - momentum: 0.000000
2023-10-16 10:06:51,989 epoch 6 - iter 1780/1786 - loss 0.04050214 - time (sec): 88.43 - samples/sec: 2802.76 - lr: 0.000022 - momentum: 0.000000
2023-10-16 10:06:52,248 ----------------------------------------------------------------------------------------------------
2023-10-16 10:06:52,248 EPOCH 6 done: loss 0.0405 - lr: 0.000022
2023-10-16 10:06:56,411 DEV : loss 0.1887538582086563 - f1-score (micro avg) 0.7869
2023-10-16 10:06:56,427 saving best model
2023-10-16 10:06:56,933 ----------------------------------------------------------------------------------------------------
2023-10-16 10:07:05,796 epoch 7 - iter 178/1786 - loss 0.03361585 - time (sec): 8.86 - samples/sec: 2795.40 - lr: 0.000022 - momentum: 0.000000
2023-10-16 10:07:14,528 epoch 7 - iter 356/1786 - loss 0.03329973 - time (sec): 17.59 - samples/sec: 2848.14 - lr: 0.000021 - momentum: 0.000000
2023-10-16 10:07:23,285 epoch 7 - iter 534/1786 - loss 0.03326424 - time (sec): 26.35 - samples/sec: 2855.10 - lr: 0.000021 - momentum: 0.000000
2023-10-16 10:07:31,995 epoch 7 - iter 712/1786 - loss 0.03160995 - time (sec): 35.06 - samples/sec: 2818.73 - lr: 0.000020 - momentum: 0.000000
2023-10-16 10:07:40,594 epoch 7 - iter 890/1786 - loss 0.03005408 - time (sec): 43.66 - samples/sec: 2835.09 - lr: 0.000019 - momentum: 0.000000
2023-10-16 10:07:49,327 epoch 7 - iter 1068/1786 - loss 0.03084739 - time (sec): 52.39 - samples/sec: 2832.70 - lr: 0.000019 - momentum: 0.000000
2023-10-16 10:07:58,392 epoch 7 - iter 1246/1786 - loss 0.03032454 - time (sec): 61.45 - samples/sec: 2831.30 - lr: 0.000018 - momentum: 0.000000
2023-10-16 10:08:06,931 epoch 7 - iter 1424/1786 - loss 0.03088217 - time (sec): 69.99 - samples/sec: 2818.38 - lr: 0.000018 - momentum: 0.000000
2023-10-16 10:08:15,826 epoch 7 - iter 1602/1786 - loss 0.03073768 - time (sec): 78.89 - samples/sec: 2831.34 - lr: 0.000017 - momentum: 0.000000
2023-10-16 10:08:24,577 epoch 7 - iter 1780/1786 - loss 0.03023599 - time (sec): 87.64 - samples/sec: 2829.60 - lr: 0.000017 - momentum: 0.000000
2023-10-16 10:08:24,844 ----------------------------------------------------------------------------------------------------
2023-10-16 10:08:24,844 EPOCH 7 done: loss 0.0302 - lr: 0.000017
2023-10-16 10:08:29,610 DEV : loss 0.2058480978012085 - f1-score (micro avg) 0.7722
2023-10-16 10:08:29,627 ----------------------------------------------------------------------------------------------------
2023-10-16 10:08:38,694 epoch 8 - iter 178/1786 - loss 0.04352804 - time (sec): 9.07 - samples/sec: 2988.38 - lr: 0.000016 - momentum: 0.000000
2023-10-16 10:08:47,551 epoch 8 - iter 356/1786 - loss 0.03156578 - time (sec): 17.92 - samples/sec: 2921.28 - lr: 0.000016 - momentum: 0.000000
2023-10-16 10:08:56,308 epoch 8 - iter 534/1786 - loss 0.02706534 - time (sec): 26.68 - samples/sec: 2895.29 - lr: 0.000015 - momentum: 0.000000
2023-10-16 10:09:04,820 epoch 8 - iter 712/1786 - loss 0.02702835 - time (sec): 35.19 - samples/sec: 2867.00 - lr: 0.000014 - momentum: 0.000000
2023-10-16 10:09:13,393 epoch 8 - iter 890/1786 - loss 0.02599787 - time (sec): 43.76 - samples/sec: 2854.41 - lr: 0.000014 - momentum: 0.000000
2023-10-16 10:09:22,260 epoch 8 - iter 1068/1786 - loss 0.02445266 - time (sec): 52.63 - samples/sec: 2819.22 - lr: 0.000013 - momentum: 0.000000
2023-10-16 10:09:31,228 epoch 8 - iter 1246/1786 - loss 0.02430149 - time (sec): 61.60 - samples/sec: 2840.93 - lr: 0.000013 - momentum: 0.000000
2023-10-16 10:09:40,085 epoch 8 - iter 1424/1786 - loss 0.02378568 - time (sec): 70.46 - samples/sec: 2827.21 - lr: 0.000012 - momentum: 0.000000
2023-10-16 10:09:48,717 epoch 8 - iter 1602/1786 - loss 0.02346589 - time (sec): 79.09 - samples/sec: 2807.17 - lr: 0.000012 - momentum: 0.000000
2023-10-16 10:09:57,557 epoch 8 - iter 1780/1786 - loss 0.02359126 - time (sec): 87.93 - samples/sec: 2821.74 - lr: 0.000011 - momentum: 0.000000
2023-10-16 10:09:57,838 ----------------------------------------------------------------------------------------------------
2023-10-16 10:09:57,839 EPOCH 8 done: loss 0.0235 - lr: 0.000011
2023-10-16 10:10:02,026 DEV : loss 0.20242059230804443 - f1-score (micro avg) 0.8011
2023-10-16 10:10:02,043 saving best model
2023-10-16 10:10:02,569 ----------------------------------------------------------------------------------------------------
2023-10-16 10:10:11,249 epoch 9 - iter 178/1786 - loss 0.02199362 - time (sec): 8.67 - samples/sec: 2936.70 - lr: 0.000011 - momentum: 0.000000
2023-10-16 10:10:19,860 epoch 9 - iter 356/1786 - loss 0.01834317 - time (sec): 17.29 - samples/sec: 2843.56 - lr: 0.000010 - momentum: 0.000000
2023-10-16 10:10:28,500 epoch 9 - iter 534/1786 - loss 0.01636593 - time (sec): 25.93 - samples/sec: 2848.80 - lr: 0.000009 - momentum: 0.000000
2023-10-16 10:10:37,237 epoch 9 - iter 712/1786 - loss 0.01554274 - time (sec): 34.66 - samples/sec: 2869.85 - lr: 0.000009 - momentum: 0.000000
2023-10-16 10:10:45,796 epoch 9 - iter 890/1786 - loss 0.01668982 - time (sec): 43.22 - samples/sec: 2843.23 - lr: 0.000008 - momentum: 0.000000
2023-10-16 10:10:54,295 epoch 9 - iter 1068/1786 - loss 0.01736686 - time (sec): 51.72 - samples/sec: 2846.50 - lr: 0.000008 - momentum: 0.000000
2023-10-16 10:11:03,452 epoch 9 - iter 1246/1786 - loss 0.01717748 - time (sec): 60.88 - samples/sec: 2822.73 - lr: 0.000007 - momentum: 0.000000
2023-10-16 10:11:12,044 epoch 9 - iter 1424/1786 - loss 0.01657150 - time (sec): 69.47 - samples/sec: 2830.65 - lr: 0.000007 - momentum: 0.000000
2023-10-16 10:11:20,876 epoch 9 - iter 1602/1786 - loss 0.01682811 - time (sec): 78.30 - samples/sec: 2823.29 - lr: 0.000006 - momentum: 0.000000
2023-10-16 10:11:29,889 epoch 9 - iter 1780/1786 - loss 0.01701446 - time (sec): 87.31 - samples/sec: 2839.70 - lr: 0.000006 - momentum: 0.000000
2023-10-16 10:11:30,162 ----------------------------------------------------------------------------------------------------
2023-10-16 10:11:30,163 EPOCH 9 done: loss 0.0170 - lr: 0.000006
2023-10-16 10:11:34,352 DEV : loss 0.20268410444259644 - f1-score (micro avg) 0.7997
2023-10-16 10:11:34,368 ----------------------------------------------------------------------------------------------------
2023-10-16 10:11:43,098 epoch 10 - iter 178/1786 - loss 0.01040220 - time (sec): 8.73 - samples/sec: 2763.04 - lr: 0.000005 - momentum: 0.000000
2023-10-16 10:11:51,917 epoch 10 - iter 356/1786 - loss 0.00841270 - time (sec): 17.55 - samples/sec: 2818.53 - lr: 0.000004 - momentum: 0.000000
2023-10-16 10:12:00,800 epoch 10 - iter 534/1786 - loss 0.01266186 - time (sec): 26.43 - samples/sec: 2818.85 - lr: 0.000004 - momentum: 0.000000
2023-10-16 10:12:09,453 epoch 10 - iter 712/1786 - loss 0.01314607 - time (sec): 35.08 - samples/sec: 2838.64 - lr: 0.000003 - momentum: 0.000000
2023-10-16 10:12:18,110 epoch 10 - iter 890/1786 - loss 0.01322097 - time (sec): 43.74 - samples/sec: 2848.20 - lr: 0.000003 - momentum: 0.000000
2023-10-16 10:12:26,982 epoch 10 - iter 1068/1786 - loss 0.01308993 - time (sec): 52.61 - samples/sec: 2851.51 - lr: 0.000002 - momentum: 0.000000
2023-10-16 10:12:35,888 epoch 10 - iter 1246/1786 - loss 0.01241075 - time (sec): 61.52 - samples/sec: 2855.40 - lr: 0.000002 - momentum: 0.000000
2023-10-16 10:12:44,553 epoch 10 - iter 1424/1786 - loss 0.01264047 - time (sec): 70.18 - samples/sec: 2847.46 - lr: 0.000001 - momentum: 0.000000
2023-10-16 10:12:53,476 epoch 10 - iter 1602/1786 - loss 0.01178041 - time (sec): 79.11 - samples/sec: 2839.80 - lr: 0.000001 - momentum: 0.000000
2023-10-16 10:13:02,172 epoch 10 - iter 1780/1786 - loss 0.01160785 - time (sec): 87.80 - samples/sec: 2826.86 - lr: 0.000000 - momentum: 0.000000
2023-10-16 10:13:02,437 ----------------------------------------------------------------------------------------------------
2023-10-16 10:13:02,438 EPOCH 10 done: loss 0.0116 - lr: 0.000000
2023-10-16 10:13:07,862 DEV : loss 0.207749143242836 - f1-score (micro avg) 0.8078
2023-10-16 10:13:07,880 saving best model
2023-10-16 10:13:09,016 ----------------------------------------------------------------------------------------------------
2023-10-16 10:13:09,018 Loading model from best epoch ...
2023-10-16 10:13:11,262 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-16 10:13:21,169 
Results:
- F-score (micro) 0.6891
- F-score (macro) 0.607
- Accuracy 0.5469

By class:
              precision    recall  f1-score   support

         LOC     0.7334    0.6758    0.7034      1095
         PER     0.7644    0.7727    0.7686      1012
         ORG     0.4248    0.5462    0.4779       357
   HumanProd     0.3729    0.6667    0.4783        33

   micro avg     0.6820    0.6964    0.6891      2497
   macro avg     0.5739    0.6654    0.6070      2497
weighted avg     0.6971    0.6964    0.6946      2497
2023-10-16 10:13:21,170 ----------------------------------------------------------------------------------------------------
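Editor's note: the hyperparameters recorded in this log (learning rate 5e-05, mini-batch size 4, 10 epochs, LinearScheduler with warmup_fraction 0.1, last-layer "first" subtoken pooling, no CRF) correspond to a Flair fine-tuning setup along the following lines. This is a hedged reconstruction from the log, not the original run script; exact argument names depend on the Flair release, and the dataset/label-dictionary calls are abbreviated.

    # Sketch reconstructed from the logged parameters -- NOT the original script.
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    corpus = NER_HIPE_2022(dataset_name="newseye", language="fr")

    embeddings = TransformerWordEmbeddings(
        model="dbmdz/bert-base-historic-multilingual-cased",
        layers="-1",               # "layers-1" in the base path: last layer only
        subtoken_pooling="first",  # "poolingfirst" in the base path
        fine_tune=True,
    )

    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=corpus.make_label_dictionary("ner"),
        tag_type="ner",
        use_crf=False,             # "crfFalse" in the base path
        use_rnn=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2",
        learning_rate=5e-05,
        mini_batch_size=4,
        max_epochs=10,
        warmup_fraction=0.1,       # the LinearScheduler plugin above
    )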
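Editor's note: the micro- and macro-averaged scores in the final test report can be cross-checked from the per-class precision/recall/support figures alone. The sketch below (plain Python, no dependencies) reconstructs per-class true-positive counts as recall × support and predicted-span counts as TP / precision; the integer rounding is an assumption, but the recovered averages match the log.

```python
# Cross-check the "micro avg" / macro F-score rows of the final report
# from the per-class (precision, recall, support) values in the log.
classes = {
    "LOC":       (0.7334, 0.6758, 1095),
    "PER":       (0.7644, 0.7727, 1012),
    "ORG":       (0.4248, 0.5462,  357),
    "HumanProd": (0.3729, 0.6667,   33),
}

# True positives and predicted spans, reconstructed per class.
tp = {c: round(r * s) for c, (p, r, s) in classes.items()}
pred = {c: round(tp[c] / p) for c, (p, r, s) in classes.items()}

total_tp = sum(tp.values())                               # pooled true positives
total_pred = sum(pred.values())                           # pooled predictions
total_support = sum(s for _, _, s in classes.values())    # 2497, as logged

micro_p = total_tp / total_pred
micro_r = total_tp / total_support
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)
macro_f1 = sum(2 * p * r / (p + r) for p, r, _ in classes.values()) / len(classes)
```

Rounding `micro_p`, `micro_r`, and `micro_f1` to four decimals reproduces the logged 0.6820 / 0.6964 / 0.6891, and `macro_f1` rounds to the logged 0.607, which confirms the report's averages are internally consistent.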