|
2023-10-16 12:45:52,143 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:45:52,144 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-16 12:45:52,144 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:45:52,145 MultiCorpus: 7142 train + 698 dev + 2570 test sentences |
|
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator |
|
2023-10-16 12:45:52,145 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:45:52,145 Train: 7142 sentences |
|
2023-10-16 12:45:52,145 (train_with_dev=False, train_with_test=False) |
|
2023-10-16 12:45:52,145 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:45:52,145 Training Params: |
|
2023-10-16 12:45:52,145 - learning_rate: "3e-05" |
|
2023-10-16 12:45:52,145 - mini_batch_size: "8" |
|
2023-10-16 12:45:52,145 - max_epochs: "10" |
|
2023-10-16 12:45:52,145 - shuffle: "True" |
|
2023-10-16 12:45:52,145 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:45:52,145 Plugins: |
|
2023-10-16 12:45:52,145 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-16 12:45:52,145 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:45:52,145 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-16 12:45:52,145 - metric: "('micro avg', 'f1-score')" |
|
2023-10-16 12:45:52,145 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:45:52,145 Computation: |
|
2023-10-16 12:45:52,145 - compute on device: cuda:0 |
|
2023-10-16 12:45:52,145 - embedding storage: none |
|
2023-10-16 12:45:52,145 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:45:52,145 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-16 12:45:52,145 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:45:52,145 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:46:00,431 epoch 1 - iter 89/893 - loss 2.55511339 - time (sec): 8.28 - samples/sec: 2909.27 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 12:46:07,327 epoch 1 - iter 178/893 - loss 1.61235074 - time (sec): 15.18 - samples/sec: 3224.15 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 12:46:14,109 epoch 1 - iter 267/893 - loss 1.21408701 - time (sec): 21.96 - samples/sec: 3363.25 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 12:46:21,076 epoch 1 - iter 356/893 - loss 0.98196317 - time (sec): 28.93 - samples/sec: 3425.09 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 12:46:27,573 epoch 1 - iter 445/893 - loss 0.84813604 - time (sec): 35.43 - samples/sec: 3461.42 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 12:46:34,458 epoch 1 - iter 534/893 - loss 0.73801609 - time (sec): 42.31 - samples/sec: 3505.89 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 12:46:41,293 epoch 1 - iter 623/893 - loss 0.65814262 - time (sec): 49.15 - samples/sec: 3539.49 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 12:46:48,032 epoch 1 - iter 712/893 - loss 0.60020660 - time (sec): 55.89 - samples/sec: 3546.66 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 12:46:54,998 epoch 1 - iter 801/893 - loss 0.55223454 - time (sec): 62.85 - samples/sec: 3546.82 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 12:47:01,870 epoch 1 - iter 890/893 - loss 0.51136141 - time (sec): 69.72 - samples/sec: 3557.52 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 12:47:02,083 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:47:02,083 EPOCH 1 done: loss 0.5100 - lr: 0.000030 |
|
2023-10-16 12:47:04,728 DEV : loss 0.11952730268239975 - f1-score (micro avg) 0.6946 |
|
2023-10-16 12:47:04,745 saving best model |
|
2023-10-16 12:47:05,151 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:47:12,143 epoch 2 - iter 89/893 - loss 0.11614387 - time (sec): 6.99 - samples/sec: 3765.39 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-16 12:47:18,713 epoch 2 - iter 178/893 - loss 0.11703984 - time (sec): 13.56 - samples/sec: 3705.32 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 12:47:25,911 epoch 2 - iter 267/893 - loss 0.11593650 - time (sec): 20.76 - samples/sec: 3665.76 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 12:47:33,003 epoch 2 - iter 356/893 - loss 0.11898584 - time (sec): 27.85 - samples/sec: 3575.78 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-16 12:47:40,167 epoch 2 - iter 445/893 - loss 0.11380050 - time (sec): 35.01 - samples/sec: 3595.39 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 12:47:46,495 epoch 2 - iter 534/893 - loss 0.11094488 - time (sec): 41.34 - samples/sec: 3622.50 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 12:47:53,145 epoch 2 - iter 623/893 - loss 0.11041525 - time (sec): 47.99 - samples/sec: 3621.92 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-16 12:47:59,812 epoch 2 - iter 712/893 - loss 0.10964599 - time (sec): 54.66 - samples/sec: 3624.04 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 12:48:06,674 epoch 2 - iter 801/893 - loss 0.10815829 - time (sec): 61.52 - samples/sec: 3626.54 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 12:48:13,580 epoch 2 - iter 890/893 - loss 0.10593895 - time (sec): 68.43 - samples/sec: 3619.64 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-16 12:48:13,861 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:48:13,861 EPOCH 2 done: loss 0.1057 - lr: 0.000027 |
|
2023-10-16 12:48:18,043 DEV : loss 0.10355333238840103 - f1-score (micro avg) 0.7514 |
|
2023-10-16 12:48:18,059 saving best model |
|
2023-10-16 12:48:18,614 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:48:25,574 epoch 3 - iter 89/893 - loss 0.06161232 - time (sec): 6.96 - samples/sec: 3465.77 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 12:48:33,414 epoch 3 - iter 178/893 - loss 0.05931401 - time (sec): 14.80 - samples/sec: 3535.00 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 12:48:40,156 epoch 3 - iter 267/893 - loss 0.06373319 - time (sec): 21.54 - samples/sec: 3554.90 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-16 12:48:46,889 epoch 3 - iter 356/893 - loss 0.06176266 - time (sec): 28.27 - samples/sec: 3583.15 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 12:48:53,244 epoch 3 - iter 445/893 - loss 0.06271250 - time (sec): 34.63 - samples/sec: 3585.52 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 12:48:59,780 epoch 3 - iter 534/893 - loss 0.06428874 - time (sec): 41.16 - samples/sec: 3599.68 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-16 12:49:06,534 epoch 3 - iter 623/893 - loss 0.06294223 - time (sec): 47.92 - samples/sec: 3638.07 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 12:49:12,998 epoch 3 - iter 712/893 - loss 0.06425486 - time (sec): 54.38 - samples/sec: 3652.85 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 12:49:19,311 epoch 3 - iter 801/893 - loss 0.06474582 - time (sec): 60.70 - samples/sec: 3655.96 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-16 12:49:26,634 epoch 3 - iter 890/893 - loss 0.06378154 - time (sec): 68.02 - samples/sec: 3646.66 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 12:49:26,907 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:49:26,907 EPOCH 3 done: loss 0.0640 - lr: 0.000023 |
|
2023-10-16 12:49:31,625 DEV : loss 0.12550972402095795 - f1-score (micro avg) 0.7835 |
|
2023-10-16 12:49:31,640 saving best model |
|
2023-10-16 12:49:32,212 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:49:39,003 epoch 4 - iter 89/893 - loss 0.04612929 - time (sec): 6.79 - samples/sec: 3721.21 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 12:49:45,662 epoch 4 - iter 178/893 - loss 0.04670224 - time (sec): 13.45 - samples/sec: 3748.05 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-16 12:49:52,483 epoch 4 - iter 267/893 - loss 0.04673840 - time (sec): 20.27 - samples/sec: 3689.22 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 12:49:59,270 epoch 4 - iter 356/893 - loss 0.04627007 - time (sec): 27.06 - samples/sec: 3678.67 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 12:50:06,421 epoch 4 - iter 445/893 - loss 0.04530410 - time (sec): 34.21 - samples/sec: 3675.49 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-16 12:50:13,138 epoch 4 - iter 534/893 - loss 0.04603225 - time (sec): 40.92 - samples/sec: 3680.66 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 12:50:19,943 epoch 4 - iter 623/893 - loss 0.04600487 - time (sec): 47.73 - samples/sec: 3649.96 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 12:50:26,861 epoch 4 - iter 712/893 - loss 0.04696416 - time (sec): 54.65 - samples/sec: 3624.44 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-16 12:50:33,465 epoch 4 - iter 801/893 - loss 0.04647850 - time (sec): 61.25 - samples/sec: 3635.01 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 12:50:40,628 epoch 4 - iter 890/893 - loss 0.04696665 - time (sec): 68.41 - samples/sec: 3626.05 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 12:50:40,819 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:50:40,819 EPOCH 4 done: loss 0.0469 - lr: 0.000020 |
|
2023-10-16 12:50:45,113 DEV : loss 0.14134925603866577 - f1-score (micro avg) 0.7836 |
|
2023-10-16 12:50:45,130 saving best model |
|
2023-10-16 12:50:45,700 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:50:52,682 epoch 5 - iter 89/893 - loss 0.02860295 - time (sec): 6.98 - samples/sec: 3752.60 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-16 12:50:59,524 epoch 5 - iter 178/893 - loss 0.03232203 - time (sec): 13.82 - samples/sec: 3696.27 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 12:51:06,414 epoch 5 - iter 267/893 - loss 0.03327893 - time (sec): 20.71 - samples/sec: 3549.51 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 12:51:13,137 epoch 5 - iter 356/893 - loss 0.03240572 - time (sec): 27.44 - samples/sec: 3555.85 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-16 12:51:20,796 epoch 5 - iter 445/893 - loss 0.03319621 - time (sec): 35.09 - samples/sec: 3560.42 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 12:51:27,811 epoch 5 - iter 534/893 - loss 0.03409730 - time (sec): 42.11 - samples/sec: 3572.12 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 12:51:34,927 epoch 5 - iter 623/893 - loss 0.03454180 - time (sec): 49.23 - samples/sec: 3592.93 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-16 12:51:41,334 epoch 5 - iter 712/893 - loss 0.03417524 - time (sec): 55.63 - samples/sec: 3589.84 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 12:51:48,137 epoch 5 - iter 801/893 - loss 0.03472578 - time (sec): 62.44 - samples/sec: 3578.49 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 12:51:54,801 epoch 5 - iter 890/893 - loss 0.03490357 - time (sec): 69.10 - samples/sec: 3588.45 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-16 12:51:55,044 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:51:55,044 EPOCH 5 done: loss 0.0348 - lr: 0.000017 |
|
2023-10-16 12:51:59,372 DEV : loss 0.17483924329280853 - f1-score (micro avg) 0.7945 |
|
2023-10-16 12:51:59,392 saving best model |
|
2023-10-16 12:51:59,954 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:52:06,819 epoch 6 - iter 89/893 - loss 0.02626343 - time (sec): 6.86 - samples/sec: 3670.10 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 12:52:13,623 epoch 6 - iter 178/893 - loss 0.03047495 - time (sec): 13.67 - samples/sec: 3610.14 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 12:52:20,194 epoch 6 - iter 267/893 - loss 0.03214194 - time (sec): 20.24 - samples/sec: 3673.48 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-16 12:52:27,056 epoch 6 - iter 356/893 - loss 0.03185153 - time (sec): 27.10 - samples/sec: 3671.57 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 12:52:34,130 epoch 6 - iter 445/893 - loss 0.02932483 - time (sec): 34.17 - samples/sec: 3672.16 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 12:52:40,891 epoch 6 - iter 534/893 - loss 0.02993883 - time (sec): 40.93 - samples/sec: 3659.20 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-16 12:52:47,512 epoch 6 - iter 623/893 - loss 0.02884411 - time (sec): 47.56 - samples/sec: 3683.06 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 12:52:54,195 epoch 6 - iter 712/893 - loss 0.02850536 - time (sec): 54.24 - samples/sec: 3668.78 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 12:53:01,056 epoch 6 - iter 801/893 - loss 0.02830873 - time (sec): 61.10 - samples/sec: 3657.82 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-16 12:53:08,041 epoch 6 - iter 890/893 - loss 0.02857723 - time (sec): 68.08 - samples/sec: 3645.85 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 12:53:08,211 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:53:08,212 EPOCH 6 done: loss 0.0287 - lr: 0.000013 |
|
2023-10-16 12:53:12,859 DEV : loss 0.19940702617168427 - f1-score (micro avg) 0.7727 |
|
2023-10-16 12:53:12,875 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:53:19,596 epoch 7 - iter 89/893 - loss 0.02500460 - time (sec): 6.72 - samples/sec: 3541.40 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 12:53:26,259 epoch 7 - iter 178/893 - loss 0.02277466 - time (sec): 13.38 - samples/sec: 3591.82 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-16 12:53:33,085 epoch 7 - iter 267/893 - loss 0.02090527 - time (sec): 20.21 - samples/sec: 3616.67 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 12:53:39,862 epoch 7 - iter 356/893 - loss 0.02011475 - time (sec): 26.99 - samples/sec: 3636.71 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 12:53:46,751 epoch 7 - iter 445/893 - loss 0.02069675 - time (sec): 33.88 - samples/sec: 3629.54 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-16 12:53:53,473 epoch 7 - iter 534/893 - loss 0.02079766 - time (sec): 40.60 - samples/sec: 3626.98 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 12:54:00,256 epoch 7 - iter 623/893 - loss 0.02128472 - time (sec): 47.38 - samples/sec: 3641.22 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 12:54:07,283 epoch 7 - iter 712/893 - loss 0.02254890 - time (sec): 54.41 - samples/sec: 3648.20 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-16 12:54:13,922 epoch 7 - iter 801/893 - loss 0.02269062 - time (sec): 61.05 - samples/sec: 3639.12 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 12:54:20,953 epoch 7 - iter 890/893 - loss 0.02194304 - time (sec): 68.08 - samples/sec: 3640.63 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 12:54:21,171 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:54:21,171 EPOCH 7 done: loss 0.0220 - lr: 0.000010 |
|
2023-10-16 12:54:25,732 DEV : loss 0.19999033212661743 - f1-score (micro avg) 0.7912 |
|
2023-10-16 12:54:25,755 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:54:32,512 epoch 8 - iter 89/893 - loss 0.01168160 - time (sec): 6.76 - samples/sec: 3588.80 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-16 12:54:39,559 epoch 8 - iter 178/893 - loss 0.01427644 - time (sec): 13.80 - samples/sec: 3476.32 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 12:54:46,333 epoch 8 - iter 267/893 - loss 0.01591158 - time (sec): 20.58 - samples/sec: 3543.59 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 12:54:53,376 epoch 8 - iter 356/893 - loss 0.01732768 - time (sec): 27.62 - samples/sec: 3605.76 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-16 12:55:00,019 epoch 8 - iter 445/893 - loss 0.01967539 - time (sec): 34.26 - samples/sec: 3629.85 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 12:55:07,011 epoch 8 - iter 534/893 - loss 0.01853002 - time (sec): 41.25 - samples/sec: 3637.34 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 12:55:13,914 epoch 8 - iter 623/893 - loss 0.01808121 - time (sec): 48.16 - samples/sec: 3620.58 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-16 12:55:20,357 epoch 8 - iter 712/893 - loss 0.01781687 - time (sec): 54.60 - samples/sec: 3625.12 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 12:55:27,362 epoch 8 - iter 801/893 - loss 0.01794196 - time (sec): 61.61 - samples/sec: 3612.50 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 12:55:34,282 epoch 8 - iter 890/893 - loss 0.01754005 - time (sec): 68.53 - samples/sec: 3621.46 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-16 12:55:34,488 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:55:34,489 EPOCH 8 done: loss 0.0175 - lr: 0.000007 |
|
2023-10-16 12:55:38,684 DEV : loss 0.20922400057315826 - f1-score (micro avg) 0.8038 |
|
2023-10-16 12:55:38,700 saving best model |
|
2023-10-16 12:55:39,186 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:55:45,857 epoch 9 - iter 89/893 - loss 0.00982904 - time (sec): 6.67 - samples/sec: 3621.30 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 12:55:52,816 epoch 9 - iter 178/893 - loss 0.01033872 - time (sec): 13.63 - samples/sec: 3702.90 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 12:56:00,091 epoch 9 - iter 267/893 - loss 0.01214149 - time (sec): 20.90 - samples/sec: 3649.27 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-16 12:56:06,660 epoch 9 - iter 356/893 - loss 0.01195331 - time (sec): 27.47 - samples/sec: 3654.09 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 12:56:13,921 epoch 9 - iter 445/893 - loss 0.01203697 - time (sec): 34.73 - samples/sec: 3632.41 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 12:56:20,404 epoch 9 - iter 534/893 - loss 0.01189900 - time (sec): 41.22 - samples/sec: 3644.72 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-16 12:56:27,229 epoch 9 - iter 623/893 - loss 0.01250389 - time (sec): 48.04 - samples/sec: 3649.49 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 12:56:34,221 epoch 9 - iter 712/893 - loss 0.01230678 - time (sec): 55.03 - samples/sec: 3640.52 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 12:56:40,816 epoch 9 - iter 801/893 - loss 0.01224281 - time (sec): 61.63 - samples/sec: 3637.30 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-16 12:56:47,523 epoch 9 - iter 890/893 - loss 0.01274520 - time (sec): 68.34 - samples/sec: 3631.08 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 12:56:47,696 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:56:47,696 EPOCH 9 done: loss 0.0128 - lr: 0.000003 |
|
2023-10-16 12:56:52,288 DEV : loss 0.21604780852794647 - f1-score (micro avg) 0.7879 |
|
2023-10-16 12:56:52,314 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:56:59,282 epoch 10 - iter 89/893 - loss 0.00670445 - time (sec): 6.97 - samples/sec: 3448.15 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 12:57:05,865 epoch 10 - iter 178/893 - loss 0.00793855 - time (sec): 13.55 - samples/sec: 3566.79 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-16 12:57:12,745 epoch 10 - iter 267/893 - loss 0.00788264 - time (sec): 20.43 - samples/sec: 3594.85 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 12:57:19,300 epoch 10 - iter 356/893 - loss 0.00725310 - time (sec): 26.98 - samples/sec: 3573.51 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 12:57:26,244 epoch 10 - iter 445/893 - loss 0.00808299 - time (sec): 33.93 - samples/sec: 3599.24 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-16 12:57:32,839 epoch 10 - iter 534/893 - loss 0.00893934 - time (sec): 40.52 - samples/sec: 3616.94 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 12:57:39,669 epoch 10 - iter 623/893 - loss 0.00976410 - time (sec): 47.35 - samples/sec: 3625.68 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 12:57:46,950 epoch 10 - iter 712/893 - loss 0.01000539 - time (sec): 54.63 - samples/sec: 3628.40 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-16 12:57:54,074 epoch 10 - iter 801/893 - loss 0.00983205 - time (sec): 61.76 - samples/sec: 3622.71 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 12:58:00,863 epoch 10 - iter 890/893 - loss 0.00969370 - time (sec): 68.55 - samples/sec: 3621.54 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-16 12:58:01,015 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:58:01,015 EPOCH 10 done: loss 0.0097 - lr: 0.000000 |
|
2023-10-16 12:58:05,669 DEV : loss 0.2217082679271698 - f1-score (micro avg) 0.7952 |
|
2023-10-16 12:58:06,080 ---------------------------------------------------------------------------------------------------- |
|
2023-10-16 12:58:06,081 Loading model from best epoch ... |
|
2023-10-16 12:58:07,923 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
|
2023-10-16 12:58:17,287 |
|
Results: |
|
- F-score (micro) 0.6906 |
|
- F-score (macro) 0.6137 |
|
- Accuracy 0.5455 |
|
|
|
By class: |
|
             precision    recall  f1-score   support |
|
|

         LOC    0.7176    0.6868    0.7018      1095 |
|
         PER    0.7696    0.7658    0.7677      1012 |
|
         ORG    0.4312    0.5882    0.4976       357 |
|
   HumanProd    0.4082    0.6061    0.4878        33 |
|
|

   micro avg    0.6781    0.7036    0.6906      2497 |
|
   macro avg    0.5816    0.6617    0.6137      2497 |
|
weighted avg    0.6936    0.7036    0.6965      2497 |
|
|
|
2023-10-16 12:58:17,287 ---------------------------------------------------------------------------------------------------- |
|
|