Upload folder using huggingface_hub
8b974e1
2023-10-15 02:14:00,073 ----------------------------------------------------------------------------------------------------
2023-10-15 02:14:00,074 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-15 02:14:00,074 ----------------------------------------------------------------------------------------------------
2023-10-15 02:14:00,074 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-15 02:14:00,074 ----------------------------------------------------------------------------------------------------
2023-10-15 02:14:00,074 Train: 14465 sentences
2023-10-15 02:14:00,074 (train_with_dev=False, train_with_test=False)
2023-10-15 02:14:00,074 ----------------------------------------------------------------------------------------------------
2023-10-15 02:14:00,075 Training Params:
2023-10-15 02:14:00,075 - learning_rate: "3e-05"
2023-10-15 02:14:00,075 - mini_batch_size: "8"
2023-10-15 02:14:00,075 - max_epochs: "10"
2023-10-15 02:14:00,075 - shuffle: "True"
2023-10-15 02:14:00,075 ----------------------------------------------------------------------------------------------------
2023-10-15 02:14:00,075 Plugins:
2023-10-15 02:14:00,075 - LinearScheduler | warmup_fraction: '0.1'
2023-10-15 02:14:00,075 ----------------------------------------------------------------------------------------------------
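The `LinearScheduler` plugin ramps the learning rate linearly from 0 up to the peak over the first `warmup_fraction` of all training steps, then decays it linearly back to 0; the `lr` column in the per-batch lines below follows this curve. A minimal sketch of that schedule in plain Python (step counts taken from this log: 1809 iterations × 10 epochs; the function name is my own, not a Flair API):

```python
def linear_schedule_lr(step: int, total_steps: int, peak_lr: float,
                       warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay to zero."""
    warmup_steps = int(warmup_fraction * total_steps)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# With 10 epochs x 1809 iterations and peak_lr = 3e-05, warmup_fraction = 0.1
# means warmup covers roughly the first epoch, which matches the lr values
# logged during epoch 1 (rising from ~3e-06 to 3e-05) and the slow decay after.
total_steps = 10 * 1809
```

For example, at iteration 180 of epoch 1 this gives roughly 3e-06, at the end of epoch 1 it reaches the 3e-05 peak, and at the final step it returns to 0, consistent with the logged values.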
2023-10-15 02:14:00,075 Final evaluation on model from best epoch (best-model.pt)
2023-10-15 02:14:00,075 - metric: "('micro avg', 'f1-score')"
2023-10-15 02:14:00,075 ----------------------------------------------------------------------------------------------------
2023-10-15 02:14:00,075 Computation:
2023-10-15 02:14:00,075 - compute on device: cuda:0
2023-10-15 02:14:00,075 - embedding storage: none
2023-10-15 02:14:00,075 ----------------------------------------------------------------------------------------------------
2023-10-15 02:14:00,075 Model training base path: "hmbench-letemps/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-15 02:14:00,075 ----------------------------------------------------------------------------------------------------
2023-10-15 02:14:00,075 ----------------------------------------------------------------------------------------------------
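A run like the one logged here can be approximated with Flair's fine-tuning API. The sketch below reconstructs the setup from the header above (hmBERT backbone, last layer only, "first" subtoken pooling, no CRF, lr 3e-05, batch size 8, 10 epochs); exact argument names can differ between Flair versions, so treat this as a starting point rather than the original training script:

```python
# Hyperparameters as printed in the log header.
hparams = {"learning_rate": 3e-05, "mini_batch_size": 8, "max_epochs": 10}

def train(base_path: str = "hmbench-letemps-example"):
    # Imports are local so the sketch can be read without Flair installed.
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    corpus = NER_HIPE_2022(dataset_name="letemps", language="fr")

    embeddings = TransformerWordEmbeddings(
        model="dbmdz/bert-base-historic-multilingual-cased",
        layers="-1",                # last layer only, per "layers-1" in the path
        subtoken_pooling="first",   # per "poolingfirst" in the path
        fine_tune=True,
    )
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=corpus.make_label_dictionary(label_type="ner"),
        tag_type="ner",
        use_crf=False,              # per "crfFalse" in the path
        use_rnn=False,
        reproject_embeddings=False,
    )
    # fine_tune() applies a linear scheduler with warmup,
    # as shown in the Plugins section of the header.
    ModelTrainer(tagger, corpus).fine_tune(base_path, **hparams)
```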
2023-10-15 02:14:10,927 epoch 1 - iter 180/1809 - loss 1.71620325 - time (sec): 10.85 - samples/sec: 3306.55 - lr: 0.000003 - momentum: 0.000000
2023-10-15 02:14:22,349 epoch 1 - iter 360/1809 - loss 0.94066267 - time (sec): 22.27 - samples/sec: 3372.78 - lr: 0.000006 - momentum: 0.000000
2023-10-15 02:14:33,394 epoch 1 - iter 540/1809 - loss 0.69247038 - time (sec): 33.32 - samples/sec: 3362.94 - lr: 0.000009 - momentum: 0.000000
2023-10-15 02:14:44,314 epoch 1 - iter 720/1809 - loss 0.55675267 - time (sec): 44.24 - samples/sec: 3376.46 - lr: 0.000012 - momentum: 0.000000
2023-10-15 02:14:55,634 epoch 1 - iter 900/1809 - loss 0.46969736 - time (sec): 55.56 - samples/sec: 3377.19 - lr: 0.000015 - momentum: 0.000000
2023-10-15 02:15:06,901 epoch 1 - iter 1080/1809 - loss 0.41189108 - time (sec): 66.83 - samples/sec: 3355.68 - lr: 0.000018 - momentum: 0.000000
2023-10-15 02:15:18,514 epoch 1 - iter 1260/1809 - loss 0.36613576 - time (sec): 78.44 - samples/sec: 3364.39 - lr: 0.000021 - momentum: 0.000000
2023-10-15 02:15:30,173 epoch 1 - iter 1440/1809 - loss 0.33395568 - time (sec): 90.10 - samples/sec: 3354.90 - lr: 0.000024 - momentum: 0.000000
2023-10-15 02:15:41,643 epoch 1 - iter 1620/1809 - loss 0.30862678 - time (sec): 101.57 - samples/sec: 3345.11 - lr: 0.000027 - momentum: 0.000000
2023-10-15 02:15:53,267 epoch 1 - iter 1800/1809 - loss 0.28855675 - time (sec): 113.19 - samples/sec: 3342.88 - lr: 0.000030 - momentum: 0.000000
2023-10-15 02:15:53,844 ----------------------------------------------------------------------------------------------------
2023-10-15 02:15:53,844 EPOCH 1 done: loss 0.2879 - lr: 0.000030
2023-10-15 02:15:58,741 DEV : loss 0.13172951340675354 - f1-score (micro avg) 0.619
2023-10-15 02:15:58,782 saving best model
2023-10-15 02:15:59,172 ----------------------------------------------------------------------------------------------------
2023-10-15 02:16:11,817 epoch 2 - iter 180/1809 - loss 0.08947028 - time (sec): 12.64 - samples/sec: 3016.49 - lr: 0.000030 - momentum: 0.000000
2023-10-15 02:16:23,203 epoch 2 - iter 360/1809 - loss 0.08337000 - time (sec): 24.03 - samples/sec: 3124.59 - lr: 0.000029 - momentum: 0.000000
2023-10-15 02:16:34,060 epoch 2 - iter 540/1809 - loss 0.08264446 - time (sec): 34.89 - samples/sec: 3153.05 - lr: 0.000029 - momentum: 0.000000
2023-10-15 02:16:45,345 epoch 2 - iter 720/1809 - loss 0.08177670 - time (sec): 46.17 - samples/sec: 3219.93 - lr: 0.000029 - momentum: 0.000000
2023-10-15 02:16:56,562 epoch 2 - iter 900/1809 - loss 0.08349985 - time (sec): 57.39 - samples/sec: 3268.13 - lr: 0.000028 - momentum: 0.000000
2023-10-15 02:17:07,800 epoch 2 - iter 1080/1809 - loss 0.08456116 - time (sec): 68.63 - samples/sec: 3302.75 - lr: 0.000028 - momentum: 0.000000
2023-10-15 02:17:19,132 epoch 2 - iter 1260/1809 - loss 0.08295867 - time (sec): 79.96 - samples/sec: 3308.07 - lr: 0.000028 - momentum: 0.000000
2023-10-15 02:17:30,433 epoch 2 - iter 1440/1809 - loss 0.08216898 - time (sec): 91.26 - samples/sec: 3314.28 - lr: 0.000027 - momentum: 0.000000
2023-10-15 02:17:42,075 epoch 2 - iter 1620/1809 - loss 0.08149738 - time (sec): 102.90 - samples/sec: 3308.35 - lr: 0.000027 - momentum: 0.000000
2023-10-15 02:17:53,256 epoch 2 - iter 1800/1809 - loss 0.08048420 - time (sec): 114.08 - samples/sec: 3314.18 - lr: 0.000027 - momentum: 0.000000
2023-10-15 02:17:53,802 ----------------------------------------------------------------------------------------------------
2023-10-15 02:17:53,803 EPOCH 2 done: loss 0.0807 - lr: 0.000027
2023-10-15 02:17:59,385 DEV : loss 0.12593407928943634 - f1-score (micro avg) 0.6396
2023-10-15 02:17:59,416 saving best model
2023-10-15 02:17:59,915 ----------------------------------------------------------------------------------------------------
2023-10-15 02:18:11,666 epoch 3 - iter 180/1809 - loss 0.05780327 - time (sec): 11.75 - samples/sec: 3344.57 - lr: 0.000026 - momentum: 0.000000
2023-10-15 02:18:23,153 epoch 3 - iter 360/1809 - loss 0.05701266 - time (sec): 23.24 - samples/sec: 3305.08 - lr: 0.000026 - momentum: 0.000000
2023-10-15 02:18:34,765 epoch 3 - iter 540/1809 - loss 0.05674387 - time (sec): 34.85 - samples/sec: 3283.83 - lr: 0.000026 - momentum: 0.000000
2023-10-15 02:18:46,421 epoch 3 - iter 720/1809 - loss 0.05547901 - time (sec): 46.50 - samples/sec: 3296.20 - lr: 0.000025 - momentum: 0.000000
2023-10-15 02:18:59,135 epoch 3 - iter 900/1809 - loss 0.05680081 - time (sec): 59.22 - samples/sec: 3220.71 - lr: 0.000025 - momentum: 0.000000
2023-10-15 02:19:10,559 epoch 3 - iter 1080/1809 - loss 0.05816395 - time (sec): 70.64 - samples/sec: 3227.10 - lr: 0.000025 - momentum: 0.000000
2023-10-15 02:19:22,063 epoch 3 - iter 1260/1809 - loss 0.05871572 - time (sec): 82.15 - samples/sec: 3224.99 - lr: 0.000024 - momentum: 0.000000
2023-10-15 02:19:33,639 epoch 3 - iter 1440/1809 - loss 0.05780555 - time (sec): 93.72 - samples/sec: 3223.82 - lr: 0.000024 - momentum: 0.000000
2023-10-15 02:19:45,305 epoch 3 - iter 1620/1809 - loss 0.05757005 - time (sec): 105.39 - samples/sec: 3225.36 - lr: 0.000024 - momentum: 0.000000
2023-10-15 02:19:56,661 epoch 3 - iter 1800/1809 - loss 0.05749567 - time (sec): 116.74 - samples/sec: 3239.30 - lr: 0.000023 - momentum: 0.000000
2023-10-15 02:19:57,152 ----------------------------------------------------------------------------------------------------
2023-10-15 02:19:57,152 EPOCH 3 done: loss 0.0575 - lr: 0.000023
2023-10-15 02:20:02,874 DEV : loss 0.1452324539422989 - f1-score (micro avg) 0.6325
2023-10-15 02:20:02,916 ----------------------------------------------------------------------------------------------------
2023-10-15 02:20:14,468 epoch 4 - iter 180/1809 - loss 0.03496327 - time (sec): 11.55 - samples/sec: 3129.42 - lr: 0.000023 - momentum: 0.000000
2023-10-15 02:20:26,389 epoch 4 - iter 360/1809 - loss 0.04079733 - time (sec): 23.47 - samples/sec: 3147.05 - lr: 0.000023 - momentum: 0.000000
2023-10-15 02:20:38,078 epoch 4 - iter 540/1809 - loss 0.03823883 - time (sec): 35.16 - samples/sec: 3217.16 - lr: 0.000022 - momentum: 0.000000
2023-10-15 02:20:49,395 epoch 4 - iter 720/1809 - loss 0.04151284 - time (sec): 46.48 - samples/sec: 3230.29 - lr: 0.000022 - momentum: 0.000000
2023-10-15 02:21:01,210 epoch 4 - iter 900/1809 - loss 0.04052903 - time (sec): 58.29 - samples/sec: 3250.78 - lr: 0.000022 - momentum: 0.000000
2023-10-15 02:21:12,762 epoch 4 - iter 1080/1809 - loss 0.03971197 - time (sec): 69.84 - samples/sec: 3252.29 - lr: 0.000021 - momentum: 0.000000
2023-10-15 02:21:23,785 epoch 4 - iter 1260/1809 - loss 0.04066587 - time (sec): 80.87 - samples/sec: 3260.03 - lr: 0.000021 - momentum: 0.000000
2023-10-15 02:21:34,819 epoch 4 - iter 1440/1809 - loss 0.04096194 - time (sec): 91.90 - samples/sec: 3285.20 - lr: 0.000021 - momentum: 0.000000
2023-10-15 02:21:46,289 epoch 4 - iter 1620/1809 - loss 0.04138910 - time (sec): 103.37 - samples/sec: 3289.06 - lr: 0.000020 - momentum: 0.000000
2023-10-15 02:21:57,449 epoch 4 - iter 1800/1809 - loss 0.04070492 - time (sec): 114.53 - samples/sec: 3299.55 - lr: 0.000020 - momentum: 0.000000
2023-10-15 02:21:58,939 ----------------------------------------------------------------------------------------------------
2023-10-15 02:21:58,939 EPOCH 4 done: loss 0.0408 - lr: 0.000020
2023-10-15 02:22:04,577 DEV : loss 0.20669673383235931 - f1-score (micro avg) 0.6421
2023-10-15 02:22:04,608 saving best model
2023-10-15 02:22:05,114 ----------------------------------------------------------------------------------------------------
2023-10-15 02:22:16,531 epoch 5 - iter 180/1809 - loss 0.02389214 - time (sec): 11.41 - samples/sec: 3214.97 - lr: 0.000020 - momentum: 0.000000
2023-10-15 02:22:28,115 epoch 5 - iter 360/1809 - loss 0.02800805 - time (sec): 23.00 - samples/sec: 3206.97 - lr: 0.000019 - momentum: 0.000000
2023-10-15 02:22:39,847 epoch 5 - iter 540/1809 - loss 0.02827358 - time (sec): 34.73 - samples/sec: 3185.08 - lr: 0.000019 - momentum: 0.000000
2023-10-15 02:22:51,849 epoch 5 - iter 720/1809 - loss 0.02901182 - time (sec): 46.73 - samples/sec: 3194.78 - lr: 0.000019 - momentum: 0.000000
2023-10-15 02:23:03,331 epoch 5 - iter 900/1809 - loss 0.02864829 - time (sec): 58.22 - samples/sec: 3204.14 - lr: 0.000018 - momentum: 0.000000
2023-10-15 02:23:14,829 epoch 5 - iter 1080/1809 - loss 0.02941320 - time (sec): 69.71 - samples/sec: 3227.23 - lr: 0.000018 - momentum: 0.000000
2023-10-15 02:23:26,496 epoch 5 - iter 1260/1809 - loss 0.02925040 - time (sec): 81.38 - samples/sec: 3223.63 - lr: 0.000018 - momentum: 0.000000
2023-10-15 02:23:38,853 epoch 5 - iter 1440/1809 - loss 0.03024850 - time (sec): 93.74 - samples/sec: 3224.36 - lr: 0.000017 - momentum: 0.000000
2023-10-15 02:23:50,326 epoch 5 - iter 1620/1809 - loss 0.03013904 - time (sec): 105.21 - samples/sec: 3242.00 - lr: 0.000017 - momentum: 0.000000
2023-10-15 02:24:01,655 epoch 5 - iter 1800/1809 - loss 0.03031792 - time (sec): 116.54 - samples/sec: 3245.16 - lr: 0.000017 - momentum: 0.000000
2023-10-15 02:24:02,192 ----------------------------------------------------------------------------------------------------
2023-10-15 02:24:02,192 EPOCH 5 done: loss 0.0302 - lr: 0.000017
2023-10-15 02:24:09,030 DEV : loss 0.3110675513744354 - f1-score (micro avg) 0.6318
2023-10-15 02:24:09,071 ----------------------------------------------------------------------------------------------------
2023-10-15 02:24:20,379 epoch 6 - iter 180/1809 - loss 0.01816067 - time (sec): 11.31 - samples/sec: 3292.12 - lr: 0.000016 - momentum: 0.000000
2023-10-15 02:24:31,904 epoch 6 - iter 360/1809 - loss 0.02260225 - time (sec): 22.83 - samples/sec: 3258.83 - lr: 0.000016 - momentum: 0.000000
2023-10-15 02:24:43,547 epoch 6 - iter 540/1809 - loss 0.02107577 - time (sec): 34.47 - samples/sec: 3226.15 - lr: 0.000016 - momentum: 0.000000
2023-10-15 02:24:55,447 epoch 6 - iter 720/1809 - loss 0.02218085 - time (sec): 46.37 - samples/sec: 3219.97 - lr: 0.000015 - momentum: 0.000000
2023-10-15 02:25:07,299 epoch 6 - iter 900/1809 - loss 0.02197810 - time (sec): 58.23 - samples/sec: 3230.90 - lr: 0.000015 - momentum: 0.000000
2023-10-15 02:25:18,759 epoch 6 - iter 1080/1809 - loss 0.02231769 - time (sec): 69.69 - samples/sec: 3242.11 - lr: 0.000015 - momentum: 0.000000
2023-10-15 02:25:30,458 epoch 6 - iter 1260/1809 - loss 0.02186519 - time (sec): 81.39 - samples/sec: 3259.09 - lr: 0.000014 - momentum: 0.000000
2023-10-15 02:25:41,915 epoch 6 - iter 1440/1809 - loss 0.02158008 - time (sec): 92.84 - samples/sec: 3260.83 - lr: 0.000014 - momentum: 0.000000
2023-10-15 02:25:53,268 epoch 6 - iter 1620/1809 - loss 0.02185088 - time (sec): 104.20 - samples/sec: 3259.37 - lr: 0.000014 - momentum: 0.000000
2023-10-15 02:26:04,662 epoch 6 - iter 1800/1809 - loss 0.02134928 - time (sec): 115.59 - samples/sec: 3269.94 - lr: 0.000013 - momentum: 0.000000
2023-10-15 02:26:05,244 ----------------------------------------------------------------------------------------------------
2023-10-15 02:26:05,244 EPOCH 6 done: loss 0.0214 - lr: 0.000013
2023-10-15 02:26:11,783 DEV : loss 0.3325505256652832 - f1-score (micro avg) 0.6493
2023-10-15 02:26:11,813 saving best model
2023-10-15 02:26:12,331 ----------------------------------------------------------------------------------------------------
2023-10-15 02:26:23,804 epoch 7 - iter 180/1809 - loss 0.01455509 - time (sec): 11.47 - samples/sec: 3329.38 - lr: 0.000013 - momentum: 0.000000
2023-10-15 02:26:35,078 epoch 7 - iter 360/1809 - loss 0.01121576 - time (sec): 22.75 - samples/sec: 3272.00 - lr: 0.000013 - momentum: 0.000000
2023-10-15 02:26:46,891 epoch 7 - iter 540/1809 - loss 0.01362276 - time (sec): 34.56 - samples/sec: 3260.25 - lr: 0.000012 - momentum: 0.000000
2023-10-15 02:26:58,682 epoch 7 - iter 720/1809 - loss 0.01404265 - time (sec): 46.35 - samples/sec: 3245.06 - lr: 0.000012 - momentum: 0.000000
2023-10-15 02:27:09,857 epoch 7 - iter 900/1809 - loss 0.01380675 - time (sec): 57.52 - samples/sec: 3266.88 - lr: 0.000012 - momentum: 0.000000
2023-10-15 02:27:21,560 epoch 7 - iter 1080/1809 - loss 0.01462766 - time (sec): 69.23 - samples/sec: 3276.78 - lr: 0.000011 - momentum: 0.000000
2023-10-15 02:27:32,867 epoch 7 - iter 1260/1809 - loss 0.01549142 - time (sec): 80.53 - samples/sec: 3283.47 - lr: 0.000011 - momentum: 0.000000
2023-10-15 02:27:44,136 epoch 7 - iter 1440/1809 - loss 0.01554337 - time (sec): 91.80 - samples/sec: 3284.27 - lr: 0.000011 - momentum: 0.000000
2023-10-15 02:27:56,136 epoch 7 - iter 1620/1809 - loss 0.01548596 - time (sec): 103.80 - samples/sec: 3271.47 - lr: 0.000010 - momentum: 0.000000
2023-10-15 02:28:07,652 epoch 7 - iter 1800/1809 - loss 0.01536064 - time (sec): 115.32 - samples/sec: 3279.88 - lr: 0.000010 - momentum: 0.000000
2023-10-15 02:28:08,163 ----------------------------------------------------------------------------------------------------
2023-10-15 02:28:08,164 EPOCH 7 done: loss 0.0154 - lr: 0.000010
2023-10-15 02:28:15,762 DEV : loss 0.37081876397132874 - f1-score (micro avg) 0.6315
2023-10-15 02:28:15,800 ----------------------------------------------------------------------------------------------------
2023-10-15 02:28:27,190 epoch 8 - iter 180/1809 - loss 0.00878142 - time (sec): 11.39 - samples/sec: 3326.19 - lr: 0.000010 - momentum: 0.000000
2023-10-15 02:28:38,168 epoch 8 - iter 360/1809 - loss 0.00883600 - time (sec): 22.37 - samples/sec: 3309.93 - lr: 0.000009 - momentum: 0.000000
2023-10-15 02:28:49,182 epoch 8 - iter 540/1809 - loss 0.00912758 - time (sec): 33.38 - samples/sec: 3313.93 - lr: 0.000009 - momentum: 0.000000
2023-10-15 02:29:00,302 epoch 8 - iter 720/1809 - loss 0.00987622 - time (sec): 44.50 - samples/sec: 3351.66 - lr: 0.000009 - momentum: 0.000000
2023-10-15 02:29:11,250 epoch 8 - iter 900/1809 - loss 0.01048489 - time (sec): 55.45 - samples/sec: 3345.57 - lr: 0.000008 - momentum: 0.000000
2023-10-15 02:29:22,244 epoch 8 - iter 1080/1809 - loss 0.01059545 - time (sec): 66.44 - samples/sec: 3354.63 - lr: 0.000008 - momentum: 0.000000
2023-10-15 02:29:33,312 epoch 8 - iter 1260/1809 - loss 0.01035550 - time (sec): 77.51 - samples/sec: 3362.51 - lr: 0.000008 - momentum: 0.000000
2023-10-15 02:29:44,498 epoch 8 - iter 1440/1809 - loss 0.01013690 - time (sec): 88.70 - samples/sec: 3378.06 - lr: 0.000007 - momentum: 0.000000
2023-10-15 02:29:56,057 epoch 8 - iter 1620/1809 - loss 0.00979165 - time (sec): 100.26 - samples/sec: 3380.91 - lr: 0.000007 - momentum: 0.000000
2023-10-15 02:30:07,595 epoch 8 - iter 1800/1809 - loss 0.00963702 - time (sec): 111.79 - samples/sec: 3384.29 - lr: 0.000007 - momentum: 0.000000
2023-10-15 02:30:08,146 ----------------------------------------------------------------------------------------------------
2023-10-15 02:30:08,146 EPOCH 8 done: loss 0.0097 - lr: 0.000007
2023-10-15 02:30:13,762 DEV : loss 0.4017082154750824 - f1-score (micro avg) 0.6415
2023-10-15 02:30:13,803 ----------------------------------------------------------------------------------------------------
2023-10-15 02:30:25,977 epoch 9 - iter 180/1809 - loss 0.00765845 - time (sec): 12.17 - samples/sec: 3256.59 - lr: 0.000006 - momentum: 0.000000
2023-10-15 02:30:36,972 epoch 9 - iter 360/1809 - loss 0.00675346 - time (sec): 23.17 - samples/sec: 3322.71 - lr: 0.000006 - momentum: 0.000000
2023-10-15 02:30:47,836 epoch 9 - iter 540/1809 - loss 0.00672088 - time (sec): 34.03 - samples/sec: 3343.95 - lr: 0.000006 - momentum: 0.000000
2023-10-15 02:30:58,928 epoch 9 - iter 720/1809 - loss 0.00646790 - time (sec): 45.12 - samples/sec: 3399.01 - lr: 0.000005 - momentum: 0.000000
2023-10-15 02:31:09,758 epoch 9 - iter 900/1809 - loss 0.00676701 - time (sec): 55.95 - samples/sec: 3406.90 - lr: 0.000005 - momentum: 0.000000
2023-10-15 02:31:20,723 epoch 9 - iter 1080/1809 - loss 0.00669838 - time (sec): 66.92 - samples/sec: 3400.98 - lr: 0.000005 - momentum: 0.000000
2023-10-15 02:31:32,136 epoch 9 - iter 1260/1809 - loss 0.00698570 - time (sec): 78.33 - samples/sec: 3395.23 - lr: 0.000004 - momentum: 0.000000
2023-10-15 02:31:43,046 epoch 9 - iter 1440/1809 - loss 0.00706185 - time (sec): 89.24 - samples/sec: 3404.64 - lr: 0.000004 - momentum: 0.000000
2023-10-15 02:31:54,348 epoch 9 - iter 1620/1809 - loss 0.00720346 - time (sec): 100.54 - samples/sec: 3398.13 - lr: 0.000004 - momentum: 0.000000
2023-10-15 02:32:05,269 epoch 9 - iter 1800/1809 - loss 0.00700774 - time (sec): 111.46 - samples/sec: 3392.28 - lr: 0.000003 - momentum: 0.000000
2023-10-15 02:32:05,812 ----------------------------------------------------------------------------------------------------
2023-10-15 02:32:05,812 EPOCH 9 done: loss 0.0070 - lr: 0.000003
2023-10-15 02:32:11,497 DEV : loss 0.4086902439594269 - f1-score (micro avg) 0.6492
2023-10-15 02:32:11,544 ----------------------------------------------------------------------------------------------------
2023-10-15 02:32:23,632 epoch 10 - iter 180/1809 - loss 0.00472486 - time (sec): 12.09 - samples/sec: 3177.79 - lr: 0.000003 - momentum: 0.000000
2023-10-15 02:32:35,272 epoch 10 - iter 360/1809 - loss 0.00533860 - time (sec): 23.73 - samples/sec: 3171.01 - lr: 0.000003 - momentum: 0.000000
2023-10-15 02:32:46,964 epoch 10 - iter 540/1809 - loss 0.00499615 - time (sec): 35.42 - samples/sec: 3218.25 - lr: 0.000002 - momentum: 0.000000
2023-10-15 02:32:59,939 epoch 10 - iter 720/1809 - loss 0.00489132 - time (sec): 48.39 - samples/sec: 3122.81 - lr: 0.000002 - momentum: 0.000000
2023-10-15 02:33:11,964 epoch 10 - iter 900/1809 - loss 0.00443154 - time (sec): 60.42 - samples/sec: 3144.19 - lr: 0.000002 - momentum: 0.000000
2023-10-15 02:33:23,384 epoch 10 - iter 1080/1809 - loss 0.00446488 - time (sec): 71.84 - samples/sec: 3161.77 - lr: 0.000001 - momentum: 0.000000
2023-10-15 02:33:34,855 epoch 10 - iter 1260/1809 - loss 0.00421573 - time (sec): 83.31 - samples/sec: 3181.83 - lr: 0.000001 - momentum: 0.000000
2023-10-15 02:33:46,612 epoch 10 - iter 1440/1809 - loss 0.00414519 - time (sec): 95.07 - samples/sec: 3175.58 - lr: 0.000001 - momentum: 0.000000
2023-10-15 02:33:58,058 epoch 10 - iter 1620/1809 - loss 0.00475133 - time (sec): 106.51 - samples/sec: 3196.13 - lr: 0.000000 - momentum: 0.000000
2023-10-15 02:34:09,631 epoch 10 - iter 1800/1809 - loss 0.00456537 - time (sec): 118.08 - samples/sec: 3203.33 - lr: 0.000000 - momentum: 0.000000
2023-10-15 02:34:10,184 ----------------------------------------------------------------------------------------------------
2023-10-15 02:34:10,184 EPOCH 10 done: loss 0.0046 - lr: 0.000000
2023-10-15 02:34:15,872 DEV : loss 0.4162753224372864 - f1-score (micro avg) 0.6466
2023-10-15 02:34:16,294 ----------------------------------------------------------------------------------------------------
2023-10-15 02:34:16,295 Loading model from best epoch ...
2023-10-15 02:34:17,937 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
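The 13-tag dictionary above is a BIOES scheme: each of loc, pers, and org gets S(ingle), B(egin), I(nside), and E(nd) variants, plus the shared O tag. A small illustrative decoder (my own helper, not part of Flair) shows how such a tag sequence maps to entity spans:

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end, label) token spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = label = None
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":                      # single-token entity
            spans.append((i, i + 1, lab))
            start = label = None
        elif prefix == "B":                    # entity begins
            start, label = i, lab
        elif prefix == "E" and lab == label:   # entity ends
            spans.append((start, i + 1, lab))
            start = label = None
        # "I-" simply continues the current entity
    return spans
```

For instance, `bioes_to_spans(["O", "S-loc", "B-pers", "I-pers", "E-pers"])` yields `[(1, 2, "loc"), (2, 5, "pers")]`: one single-token location and one three-token person span.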
2023-10-15 02:34:25,749
Results:
- F-score (micro) 0.6541
- F-score (macro) 0.51
- Accuracy 0.5007
By class:
              precision    recall  f1-score   support

         loc     0.6394    0.7800    0.7027       591
        pers     0.5851    0.7703    0.6651       357
         org     0.1739    0.1519    0.1622        79

   micro avg     0.5937    0.7283    0.6541      1027
   macro avg     0.4661    0.5674    0.5100      1027
weighted avg     0.5847    0.7283    0.6481      1027
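The macro and weighted averages follow directly from the per-class rows: macro is the unweighted mean of the class F-scores, weighted is the support-weighted mean. A quick check in plain Python (numbers copied from the table above):

```python
per_class = {
    "loc":  {"f1": 0.7027, "support": 591},
    "pers": {"f1": 0.6651, "support": 357},
    "org":  {"f1": 0.1622, "support": 79},
}
macro_f1 = sum(c["f1"] for c in per_class.values()) / len(per_class)
total_support = sum(c["support"] for c in per_class.values())
weighted_f1 = sum(c["f1"] * c["support"] for c in per_class.values()) / total_support

print(round(macro_f1, 4))     # 0.51   -> matches "F-score (macro)"
print(round(weighted_f1, 4))  # 0.6481 -> matches the weighted avg row
```

The low org F-score (0.1622 on only 79 test mentions) is what drags macro (0.51) well below micro (0.6541), since micro averaging weights every mention equally while macro weights every class equally.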
2023-10-15 02:34:25,749 ----------------------------------------------------------------------------------------------------