|
2023-10-16 18:10:18,197 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:10:18,198 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-16 18:10:18,198 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:10:18,198 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-16 18:10:18,198 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:10:18,198 Train: 1166 sentences |
|
2023-10-16 18:10:18,198 (train_with_dev=False, train_with_test=False) |
|
2023-10-16 18:10:18,198 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:10:18,198 Training Params: |
|
2023-10-16 18:10:18,198 - learning_rate: "5e-05" |
|
2023-10-16 18:10:18,199 - mini_batch_size: "4" |
|
2023-10-16 18:10:18,199 - max_epochs: "10" |
|
2023-10-16 18:10:18,199 - shuffle: "True" |
|
2023-10-16 18:10:18,199 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:10:18,199 Plugins: |
|
2023-10-16 18:10:18,199 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-16 18:10:18,199 ---------------------------------------------------------------------------------------------------- |
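Note on the parameters above: the `lr` column in the per-iteration lines follows from `learning_rate`, `mini_batch_size`, `max_epochs` and the `LinearScheduler` warmup fraction. The following is a minimal sketch of a generic linear warmup and linear decay schedule, not the trainer's actual code. The function names, the rounding of the warmup length, and the assumption that each line reports the rate used for that batch (one step behind the iteration counter) are all illustrative guesses.

```python
import math


def linear_schedule_lr(step, peak_lr, total_steps, warmup_fraction):
    """Learning rate after `step` completed optimizer updates (0-based).

    Hypothetical helper: linear ramp from 0 to peak_lr during warmup,
    then linear decay back to 0 at total_steps.
    """
    warmup_steps = round(total_steps * warmup_fraction)  # assumed rounding
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from the log: 1166 train sentences, batch size 4, 10 epochs.
train_sentences = 1166
mini_batch_size = 4
max_epochs = 10
peak_lr = 5e-05
warmup_fraction = 0.1

steps_per_epoch = math.ceil(train_sentences / mini_batch_size)  # 292, as in "iter 29/292"
total_steps = steps_per_epoch * max_epochs


def logged_lr(epoch, iteration):
    """Rate for a given log line, assuming it reports the lr used for that batch."""
    step = (epoch - 1) * steps_per_epoch + iteration - 1
    return linear_schedule_lr(step, peak_lr, total_steps, warmup_fraction)


if __name__ == "__main__":
    for epoch, iteration in [(1, 29), (1, 290), (5, 145), (10, 290)]:
        print(f"epoch {epoch} - iter {iteration}/{steps_per_epoch} - lr: {logged_lr(epoch, iteration):.6f}")
```

Under these assumptions the warmup covers exactly the first epoch (292 of 2920 steps), which is why the logged rate climbs during epoch 1 and falls from epoch 2 onward.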
|
2023-10-16 18:10:18,199 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-16 18:10:18,199 - metric: "('micro avg', 'f1-score')" |
|
2023-10-16 18:10:18,199 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:10:18,199 Computation: |
|
2023-10-16 18:10:18,199 - compute on device: cuda:0 |
|
2023-10-16 18:10:18,199 - embedding storage: none |
|
2023-10-16 18:10:18,199 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:10:18,199 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2" |
|
2023-10-16 18:10:18,199 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:10:19,934 epoch 1 - iter 29/292 - loss 2.79895374 - time (sec): 1.73 - samples/sec: 2999.61 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 18:10:21,440 epoch 1 - iter 58/292 - loss 2.26112408 - time (sec): 3.24 - samples/sec: 2784.64 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 18:10:23,061 epoch 1 - iter 87/292 - loss 1.73551032 - time (sec): 4.86 - samples/sec: 2718.87 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 18:10:24,560 epoch 1 - iter 116/292 - loss 1.45381972 - time (sec): 6.36 - samples/sec: 2708.23 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 18:10:26,120 epoch 1 - iter 145/292 - loss 1.25529010 - time (sec): 7.92 - samples/sec: 2667.70 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 18:10:27,792 epoch 1 - iter 174/292 - loss 1.11428781 - time (sec): 9.59 - samples/sec: 2628.59 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 18:10:29,494 epoch 1 - iter 203/292 - loss 0.96722340 - time (sec): 11.29 - samples/sec: 2701.00 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 18:10:31,124 epoch 1 - iter 232/292 - loss 0.88860237 - time (sec): 12.92 - samples/sec: 2696.10 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 18:10:32,977 epoch 1 - iter 261/292 - loss 0.82506496 - time (sec): 14.78 - samples/sec: 2709.72 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 18:10:34,518 epoch 1 - iter 290/292 - loss 0.77331968 - time (sec): 16.32 - samples/sec: 2702.82 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 18:10:34,621 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:10:34,621 EPOCH 1 done: loss 0.7690 - lr: 0.000049 |
|
2023-10-16 18:10:35,854 DEV : loss 0.2004041224718094 - f1-score (micro avg) 0.4475 |
|
2023-10-16 18:10:35,860 saving best model |
|
2023-10-16 18:10:36,349 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:10:37,956 epoch 2 - iter 29/292 - loss 0.21938164 - time (sec): 1.61 - samples/sec: 2636.68 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 18:10:39,534 epoch 2 - iter 58/292 - loss 0.21422136 - time (sec): 3.18 - samples/sec: 2620.74 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-16 18:10:41,136 epoch 2 - iter 87/292 - loss 0.20674539 - time (sec): 4.79 - samples/sec: 2593.64 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 18:10:42,738 epoch 2 - iter 116/292 - loss 0.20624764 - time (sec): 6.39 - samples/sec: 2603.98 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-16 18:10:44,384 epoch 2 - iter 145/292 - loss 0.20198539 - time (sec): 8.03 - samples/sec: 2617.16 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 18:10:46,189 epoch 2 - iter 174/292 - loss 0.19700909 - time (sec): 9.84 - samples/sec: 2680.82 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-16 18:10:47,868 epoch 2 - iter 203/292 - loss 0.19421637 - time (sec): 11.52 - samples/sec: 2710.40 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 18:10:49,486 epoch 2 - iter 232/292 - loss 0.19245480 - time (sec): 13.14 - samples/sec: 2721.24 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-16 18:10:51,068 epoch 2 - iter 261/292 - loss 0.19167829 - time (sec): 14.72 - samples/sec: 2715.53 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 18:10:52,706 epoch 2 - iter 290/292 - loss 0.18734658 - time (sec): 16.36 - samples/sec: 2710.77 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-16 18:10:52,798 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:10:52,798 EPOCH 2 done: loss 0.1871 - lr: 0.000045 |
|
2023-10-16 18:10:54,087 DEV : loss 0.14392311871051788 - f1-score (micro avg) 0.6274 |
|
2023-10-16 18:10:54,093 saving best model |
|
2023-10-16 18:10:54,640 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:10:56,362 epoch 3 - iter 29/292 - loss 0.12525001 - time (sec): 1.72 - samples/sec: 2533.56 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-16 18:10:57,901 epoch 3 - iter 58/292 - loss 0.10623419 - time (sec): 3.26 - samples/sec: 2703.74 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 18:10:59,570 epoch 3 - iter 87/292 - loss 0.11598220 - time (sec): 4.93 - samples/sec: 2733.37 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-16 18:11:01,071 epoch 3 - iter 116/292 - loss 0.10778250 - time (sec): 6.43 - samples/sec: 2688.67 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 18:11:02,671 epoch 3 - iter 145/292 - loss 0.10257617 - time (sec): 8.03 - samples/sec: 2689.78 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-16 18:11:04,286 epoch 3 - iter 174/292 - loss 0.10392959 - time (sec): 9.64 - samples/sec: 2673.15 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 18:11:06,119 epoch 3 - iter 203/292 - loss 0.10728673 - time (sec): 11.48 - samples/sec: 2706.88 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-16 18:11:07,780 epoch 3 - iter 232/292 - loss 0.10431397 - time (sec): 13.14 - samples/sec: 2705.06 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 18:11:09,340 epoch 3 - iter 261/292 - loss 0.10240348 - time (sec): 14.70 - samples/sec: 2705.87 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-16 18:11:11,150 epoch 3 - iter 290/292 - loss 0.10140412 - time (sec): 16.51 - samples/sec: 2682.50 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-16 18:11:11,238 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:11:11,239 EPOCH 3 done: loss 0.1013 - lr: 0.000039 |
|
2023-10-16 18:11:12,792 DEV : loss 0.15076522529125214 - f1-score (micro avg) 0.6866 |
|
2023-10-16 18:11:12,799 saving best model |
|
2023-10-16 18:11:13,343 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:11:15,173 epoch 4 - iter 29/292 - loss 0.06366572 - time (sec): 1.83 - samples/sec: 2791.55 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 18:11:16,746 epoch 4 - iter 58/292 - loss 0.08419407 - time (sec): 3.40 - samples/sec: 2791.05 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-16 18:11:18,363 epoch 4 - iter 87/292 - loss 0.08091394 - time (sec): 5.02 - samples/sec: 2760.46 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 18:11:19,994 epoch 4 - iter 116/292 - loss 0.07811546 - time (sec): 6.65 - samples/sec: 2775.52 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-16 18:11:21,738 epoch 4 - iter 145/292 - loss 0.07285773 - time (sec): 8.39 - samples/sec: 2806.00 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 18:11:23,417 epoch 4 - iter 174/292 - loss 0.07228673 - time (sec): 10.07 - samples/sec: 2796.23 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-16 18:11:24,950 epoch 4 - iter 203/292 - loss 0.07173903 - time (sec): 11.61 - samples/sec: 2790.55 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 18:11:26,623 epoch 4 - iter 232/292 - loss 0.06922635 - time (sec): 13.28 - samples/sec: 2733.83 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-16 18:11:28,237 epoch 4 - iter 261/292 - loss 0.06934773 - time (sec): 14.89 - samples/sec: 2736.98 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-16 18:11:29,735 epoch 4 - iter 290/292 - loss 0.06709901 - time (sec): 16.39 - samples/sec: 2704.16 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 18:11:29,823 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:11:29,823 EPOCH 4 done: loss 0.0669 - lr: 0.000033 |
|
2023-10-16 18:11:31,187 DEV : loss 0.12698547542095184 - f1-score (micro avg) 0.7442 |
|
2023-10-16 18:11:31,194 saving best model |
|
2023-10-16 18:11:31,738 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:11:33,348 epoch 5 - iter 29/292 - loss 0.03971967 - time (sec): 1.61 - samples/sec: 2971.03 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-16 18:11:35,043 epoch 5 - iter 58/292 - loss 0.04349609 - time (sec): 3.30 - samples/sec: 2862.25 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 18:11:36,866 epoch 5 - iter 87/292 - loss 0.04345885 - time (sec): 5.13 - samples/sec: 2838.22 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-16 18:11:38,470 epoch 5 - iter 116/292 - loss 0.04052249 - time (sec): 6.73 - samples/sec: 2806.53 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 18:11:39,989 epoch 5 - iter 145/292 - loss 0.04240124 - time (sec): 8.25 - samples/sec: 2770.39 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-16 18:11:41,530 epoch 5 - iter 174/292 - loss 0.04137216 - time (sec): 9.79 - samples/sec: 2722.46 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 18:11:43,122 epoch 5 - iter 203/292 - loss 0.04406993 - time (sec): 11.38 - samples/sec: 2693.47 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 18:11:44,829 epoch 5 - iter 232/292 - loss 0.04965846 - time (sec): 13.09 - samples/sec: 2687.74 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 18:11:46,495 epoch 5 - iter 261/292 - loss 0.04992373 - time (sec): 14.75 - samples/sec: 2651.24 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 18:11:48,221 epoch 5 - iter 290/292 - loss 0.04923535 - time (sec): 16.48 - samples/sec: 2684.64 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 18:11:48,307 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:11:48,307 EPOCH 5 done: loss 0.0490 - lr: 0.000028 |
|
2023-10-16 18:11:49,647 DEV : loss 0.12098459899425507 - f1-score (micro avg) 0.7489 |
|
2023-10-16 18:11:49,654 saving best model |
|
2023-10-16 18:11:50,251 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:11:51,792 epoch 6 - iter 29/292 - loss 0.02893335 - time (sec): 1.54 - samples/sec: 2489.86 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 18:11:53,344 epoch 6 - iter 58/292 - loss 0.02571484 - time (sec): 3.09 - samples/sec: 2674.75 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 18:11:54,891 epoch 6 - iter 87/292 - loss 0.02231029 - time (sec): 4.64 - samples/sec: 2649.66 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 18:11:56,383 epoch 6 - iter 116/292 - loss 0.02316900 - time (sec): 6.13 - samples/sec: 2686.67 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 18:11:58,105 epoch 6 - iter 145/292 - loss 0.02370091 - time (sec): 7.85 - samples/sec: 2692.09 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 18:11:59,533 epoch 6 - iter 174/292 - loss 0.02575667 - time (sec): 9.28 - samples/sec: 2668.33 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 18:12:01,236 epoch 6 - iter 203/292 - loss 0.02844555 - time (sec): 10.98 - samples/sec: 2674.47 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 18:12:02,979 epoch 6 - iter 232/292 - loss 0.02883653 - time (sec): 12.73 - samples/sec: 2708.12 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 18:12:04,682 epoch 6 - iter 261/292 - loss 0.03382284 - time (sec): 14.43 - samples/sec: 2738.11 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 18:12:06,372 epoch 6 - iter 290/292 - loss 0.03316595 - time (sec): 16.12 - samples/sec: 2743.02 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 18:12:06,460 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:12:06,461 EPOCH 6 done: loss 0.0333 - lr: 0.000022 |
|
2023-10-16 18:12:07,758 DEV : loss 0.1667584925889969 - f1-score (micro avg) 0.7421 |
|
2023-10-16 18:12:07,763 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:12:09,375 epoch 7 - iter 29/292 - loss 0.03171726 - time (sec): 1.61 - samples/sec: 2546.40 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 18:12:11,125 epoch 7 - iter 58/292 - loss 0.02587482 - time (sec): 3.36 - samples/sec: 2771.90 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 18:12:12,838 epoch 7 - iter 87/292 - loss 0.02438651 - time (sec): 5.07 - samples/sec: 2792.89 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 18:12:14,619 epoch 7 - iter 116/292 - loss 0.02407190 - time (sec): 6.85 - samples/sec: 2743.58 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 18:12:16,281 epoch 7 - iter 145/292 - loss 0.02700205 - time (sec): 8.52 - samples/sec: 2689.03 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 18:12:17,822 epoch 7 - iter 174/292 - loss 0.02378621 - time (sec): 10.06 - samples/sec: 2678.06 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 18:12:19,362 epoch 7 - iter 203/292 - loss 0.02200729 - time (sec): 11.60 - samples/sec: 2665.80 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 18:12:21,020 epoch 7 - iter 232/292 - loss 0.02317887 - time (sec): 13.26 - samples/sec: 2696.89 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 18:12:22,569 epoch 7 - iter 261/292 - loss 0.02212980 - time (sec): 14.80 - samples/sec: 2682.01 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 18:12:24,250 epoch 7 - iter 290/292 - loss 0.02749476 - time (sec): 16.49 - samples/sec: 2678.18 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 18:12:24,360 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:12:24,360 EPOCH 7 done: loss 0.0273 - lr: 0.000017 |
|
2023-10-16 18:12:25,635 DEV : loss 0.1895621120929718 - f1-score (micro avg) 0.6967 |
|
2023-10-16 18:12:25,640 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:12:27,412 epoch 8 - iter 29/292 - loss 0.02580784 - time (sec): 1.77 - samples/sec: 2688.35 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 18:12:28,839 epoch 8 - iter 58/292 - loss 0.02256540 - time (sec): 3.20 - samples/sec: 2496.98 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 18:12:30,653 epoch 8 - iter 87/292 - loss 0.02010110 - time (sec): 5.01 - samples/sec: 2508.74 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 18:12:32,605 epoch 8 - iter 116/292 - loss 0.02032594 - time (sec): 6.96 - samples/sec: 2490.68 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 18:12:34,168 epoch 8 - iter 145/292 - loss 0.02114831 - time (sec): 8.53 - samples/sec: 2560.79 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 18:12:35,818 epoch 8 - iter 174/292 - loss 0.01900158 - time (sec): 10.18 - samples/sec: 2625.35 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 18:12:37,611 epoch 8 - iter 203/292 - loss 0.01807571 - time (sec): 11.97 - samples/sec: 2670.01 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 18:12:39,215 epoch 8 - iter 232/292 - loss 0.02007551 - time (sec): 13.57 - samples/sec: 2687.76 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 18:12:40,645 epoch 8 - iter 261/292 - loss 0.01934112 - time (sec): 15.00 - samples/sec: 2669.54 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 18:12:42,184 epoch 8 - iter 290/292 - loss 0.01808974 - time (sec): 16.54 - samples/sec: 2673.29 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 18:12:42,278 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:12:42,279 EPOCH 8 done: loss 0.0181 - lr: 0.000011 |
|
2023-10-16 18:12:43,523 DEV : loss 0.15801307559013367 - f1-score (micro avg) 0.7722 |
|
2023-10-16 18:12:43,527 saving best model |
|
2023-10-16 18:12:44,050 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:12:45,848 epoch 9 - iter 29/292 - loss 0.00653241 - time (sec): 1.79 - samples/sec: 3042.49 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 18:12:47,403 epoch 9 - iter 58/292 - loss 0.01637186 - time (sec): 3.35 - samples/sec: 2787.34 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 18:12:49,039 epoch 9 - iter 87/292 - loss 0.02178710 - time (sec): 4.98 - samples/sec: 2806.55 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 18:12:50,583 epoch 9 - iter 116/292 - loss 0.01938170 - time (sec): 6.53 - samples/sec: 2840.27 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 18:12:52,201 epoch 9 - iter 145/292 - loss 0.01644012 - time (sec): 8.15 - samples/sec: 2786.41 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 18:12:53,750 epoch 9 - iter 174/292 - loss 0.01589825 - time (sec): 9.70 - samples/sec: 2747.03 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 18:12:55,369 epoch 9 - iter 203/292 - loss 0.01495786 - time (sec): 11.31 - samples/sec: 2724.12 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 18:12:57,101 epoch 9 - iter 232/292 - loss 0.01534385 - time (sec): 13.05 - samples/sec: 2714.09 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 18:12:58,778 epoch 9 - iter 261/292 - loss 0.01426417 - time (sec): 14.72 - samples/sec: 2712.16 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 18:13:00,480 epoch 9 - iter 290/292 - loss 0.01426577 - time (sec): 16.43 - samples/sec: 2698.84 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 18:13:00,568 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:13:00,568 EPOCH 9 done: loss 0.0142 - lr: 0.000006 |
|
2023-10-16 18:13:01,879 DEV : loss 0.1797230988740921 - f1-score (micro avg) 0.7197 |
|
2023-10-16 18:13:01,888 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:13:03,569 epoch 10 - iter 29/292 - loss 0.01425186 - time (sec): 1.68 - samples/sec: 2499.41 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 18:13:05,122 epoch 10 - iter 58/292 - loss 0.00873565 - time (sec): 3.23 - samples/sec: 2542.70 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 18:13:06,726 epoch 10 - iter 87/292 - loss 0.00939751 - time (sec): 4.84 - samples/sec: 2603.80 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 18:13:08,464 epoch 10 - iter 116/292 - loss 0.00986969 - time (sec): 6.57 - samples/sec: 2639.89 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 18:13:10,085 epoch 10 - iter 145/292 - loss 0.00847025 - time (sec): 8.20 - samples/sec: 2600.30 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 18:13:11,755 epoch 10 - iter 174/292 - loss 0.00994205 - time (sec): 9.87 - samples/sec: 2626.36 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 18:13:13,283 epoch 10 - iter 203/292 - loss 0.01008661 - time (sec): 11.39 - samples/sec: 2646.48 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 18:13:15,084 epoch 10 - iter 232/292 - loss 0.00916347 - time (sec): 13.19 - samples/sec: 2628.55 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 18:13:16,666 epoch 10 - iter 261/292 - loss 0.00946923 - time (sec): 14.78 - samples/sec: 2637.62 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 18:13:18,415 epoch 10 - iter 290/292 - loss 0.01083818 - time (sec): 16.53 - samples/sec: 2675.26 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 18:13:18,514 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 18:13:18,514 EPOCH 10 done: loss 0.0114 - lr: 0.000000 |
|
2023-10-16 18:13:19,807 DEV : loss 0.1740553081035614 - f1-score (micro avg) 0.7426 |
|
2023-10-16 18:13:20,260 ---------------------------------------------------------------------------------------------------- |
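Note on checkpoint selection: the "saving best model" lines above appear after epochs 1, 2, 3, 4, 5 and 8, which is what one would expect if a checkpoint is written whenever the dev micro F1 improves on the best value so far. The sketch below replays the DEV scores from this log under that assumption. The helper name and the strict `>` comparison are illustrative, not taken from the trainer.

```python
# Dev micro-average F1 after each epoch, copied from the DEV lines in this log.
dev_f1_by_epoch = [0.4475, 0.6274, 0.6866, 0.7442, 0.7489,
                   0.7421, 0.6967, 0.7722, 0.7197, 0.7426]


def epochs_that_save(scores):
    """Return the 1-based epochs whose score beats every earlier epoch.

    Hypothetical helper: assumes a checkpoint is saved only on strict improvement.
    """
    best = float("-inf")
    saved = []
    for epoch, score in enumerate(scores, start=1):
        if score > best:
            best = score
            saved.append(epoch)
    return saved


saved_epochs = epochs_that_save(dev_f1_by_epoch)
best_epoch = saved_epochs[-1]

if __name__ == "__main__":
    print("checkpoint written after epochs:", saved_epochs)
    print("best epoch:", best_epoch, "dev F1:", dev_f1_by_epoch[best_epoch - 1])
```

On this reading, the model loaded for the final evaluation is the epoch 8 checkpoint (dev F1 0.7722), not the last epoch.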
|
2023-10-16 18:13:20,261 Loading model from best epoch ... |
|
2023-10-16 18:13:21,923 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
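Note on the tag dictionary: the 17 tags listed above are consistent with a BIOES-style scheme (S = single-token entity, B = begin, E = end, I = inside) over four entity types plus the outside tag `O`, which also matches `out_features=17` in the final linear layer. A minimal sketch that rebuilds the list follows. The function name is hypothetical, and the entity-type order is simply read off the log line.

```python
def bioes_tags(entity_types):
    """Build an O + S/B/E/I tag list for the given entity types.

    Hypothetical helper; prefix order (S, B, E, I) mirrors the order in the log.
    """
    tags = ["O"]
    for entity_type in entity_types:
        tags.extend(f"{prefix}-{entity_type}" for prefix in ("S", "B", "E", "I"))
    return tags


tags = bioes_tags(["LOC", "PER", "ORG", "HumanProd"])

if __name__ == "__main__":
    print(len(tags), "tags:", ", ".join(tags))
```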
|
2023-10-16 18:13:24,313
Results:
- F-score (micro) 0.7541
- F-score (macro) 0.6943
- Accuracy 0.6298

By class:
              precision    recall  f1-score   support

         PER     0.8169    0.8075    0.8121       348
         LOC     0.6333    0.8736    0.7343       261
         ORG     0.4865    0.3462    0.4045        52
   HumanProd     0.7917    0.8636    0.8261        22

   micro avg     0.7137    0.7994    0.7541       683
   macro avg     0.6821    0.7227    0.6943       683
weighted avg     0.7208    0.7994    0.7518       683

|
2023-10-16 18:13:24,313 ---------------------------------------------------------------------------------------------------- |
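Note on the averages: the summary rows of the per-class report can be cross-checked from the four class rows. The sketch below recomputes the macro and support-weighted averages and derives the micro F1 from the reported micro precision and recall. It works from the rounded four-decimal values in the log, so agreement is only expected to within about 1e-4. The helper names are illustrative and not part of any library.

```python
# (precision, recall, f1, support) per class, copied from the report in this log.
report = {
    "PER": (0.8169, 0.8075, 0.8121, 348),
    "LOC": (0.6333, 0.8736, 0.7343, 261),
    "ORG": (0.4865, 0.3462, 0.4045, 52),
    "HumanProd": (0.7917, 0.8636, 0.8261, 22),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def macro_avg(rows, column):
    """Unweighted mean of one metric column over all classes."""
    return sum(row[column] for row in rows) / len(rows)


def weighted_avg(rows, column):
    """Mean of one metric column, weighted by each class's support."""
    total_support = sum(row[3] for row in rows)
    return sum(row[column] * row[3] for row in rows) / total_support


rows = list(report.values())
total_support = sum(row[3] for row in rows)
macro_f1 = macro_avg(rows, 2)
weighted_f1 = weighted_avg(rows, 2)
micro_f1 = f1(0.7137, 0.7994)  # from the reported micro-average precision and recall

if __name__ == "__main__":
    print(f"support total: {total_support}")
    print(f"macro F1: {macro_f1:.4f}, weighted F1: {weighted_f1:.4f}, micro F1: {micro_f1:.4f}")
```

The gap between micro F1 and macro F1 reflects the weak ORG class: it has only 52 of the 683 test entities, so it barely moves the micro average but counts as a quarter of the macro average.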
|
|