Training log uploaded by stefan-it ("Upload folder using huggingface_hub", commit 5c72760)
2023-10-19 19:42:10,225 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,226 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
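The module shapes in the printout above are enough to sanity-check the size of this "bert-tiny" encoder. The following is a back-of-envelope tally in plain Python (not Flair or PyTorch code); it counts only the parameters of the printed BertModel, using exactly the shapes shown, with bias terms included. The SequenceTagger's own linear head is not counted here.

```python
# Back-of-envelope parameter count for the 2-layer, 128-dim BERT printed above.
# All shapes are taken directly from the module printout; biases included.

H, FF, V, P, T, L = 128, 512, 32001, 512, 2, 2  # hidden, FFN, vocab, positions, token types, layers

# Word/position/token-type embeddings plus the embedding LayerNorm (weight + bias).
embeddings = V * H + P * H + T * H + 2 * H

per_layer = (
    3 * (H * H + H)      # query/key/value projections
    + (H * H + H)        # attention output dense
    + 2 * H              # attention output LayerNorm
    + (H * FF + FF)      # intermediate dense
    + (FF * H + H)       # output dense
    + 2 * H              # output LayerNorm
)

pooler = H * H + H
total = embeddings + L * per_layer + pooler
print(f"total parameters: {total:,}")
```

At roughly 4.6M parameters, almost all of the budget sits in the 32001-entry word embedding table, which is typical for tiny BERT variants.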
2023-10-19 19:42:10,226 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,226 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-19 19:42:10,226 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,226 Train: 7142 sentences
2023-10-19 19:42:10,226 (train_with_dev=False, train_with_test=False)
2023-10-19 19:42:10,226 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,226 Training Params:
2023-10-19 19:42:10,226 - learning_rate: "3e-05"
2023-10-19 19:42:10,226 - mini_batch_size: "8"
2023-10-19 19:42:10,226 - max_epochs: "10"
2023-10-19 19:42:10,226 - shuffle: "True"
2023-10-19 19:42:10,226 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,226 Plugins:
2023-10-19 19:42:10,226 - TensorboardLogger
2023-10-19 19:42:10,226 - LinearScheduler | warmup_fraction: '0.1'
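The LinearScheduler with `warmup_fraction: '0.1'` explains the `lr` column in the iteration logs below: over 10 epochs of 893 iterations (8930 steps), the first 10% (893 steps, exactly epoch 1) ramp the learning rate linearly from 0 to 3e-05, after which it decays linearly back to 0. The sketch below is an illustrative reimplementation of that schedule, not Flair's actual scheduler class.

```python
# Sketch of a linear warmup + linear decay schedule matching the logged lr column.
# Illustrative only; this is not Flair's scheduler implementation.

def linear_lr(step: int, total_steps: int, peak_lr: float, warmup_fraction: float) -> float:
    """Ramp 0 -> peak_lr over the warmup steps, then decay linearly back to 0."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 893 * 10   # 893 iterations/epoch * 10 epochs
peak = 3e-05

# Step 893 (end of epoch 1) sits exactly at the end of the 10% warmup, which is
# why the log shows lr rising to 0.000030 there and falling afterwards.
print(linear_lr(89, total, peak, 0.1))    # early in epoch 1, still warming up
print(linear_lr(893, total, peak, 0.1))   # the peak
print(linear_lr(total, total, peak, 0.1)) # zero at the final step
```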
2023-10-19 19:42:10,227 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,227 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 19:42:10,227 - metric: "('micro avg', 'f1-score')"
2023-10-19 19:42:10,227 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,227 Computation:
2023-10-19 19:42:10,227 - compute on device: cuda:0
2023-10-19 19:42:10,227 - embedding storage: none
2023-10-19 19:42:10,227 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,227 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-19 19:42:10,227 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,227 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,227 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 19:42:12,499 epoch 1 - iter 89/893 - loss 3.42294852 - time (sec): 2.27 - samples/sec: 11786.11 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:42:14,750 epoch 1 - iter 178/893 - loss 3.24973616 - time (sec): 4.52 - samples/sec: 11380.29 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:42:17,036 epoch 1 - iter 267/893 - loss 2.92472187 - time (sec): 6.81 - samples/sec: 11321.04 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:42:19,445 epoch 1 - iter 356/893 - loss 2.58407417 - time (sec): 9.22 - samples/sec: 10852.87 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:42:21,815 epoch 1 - iter 445/893 - loss 2.24270513 - time (sec): 11.59 - samples/sec: 10780.32 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:42:24,038 epoch 1 - iter 534/893 - loss 1.99303527 - time (sec): 13.81 - samples/sec: 10815.88 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:42:26,284 epoch 1 - iter 623/893 - loss 1.81063405 - time (sec): 16.06 - samples/sec: 10852.24 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:42:28,688 epoch 1 - iter 712/893 - loss 1.66666224 - time (sec): 18.46 - samples/sec: 10870.25 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:42:30,952 epoch 1 - iter 801/893 - loss 1.55222766 - time (sec): 20.72 - samples/sec: 10924.19 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:42:33,153 epoch 1 - iter 890/893 - loss 1.46483452 - time (sec): 22.93 - samples/sec: 10827.74 - lr: 0.000030 - momentum: 0.000000
2023-10-19 19:42:33,225 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:33,225 EPOCH 1 done: loss 1.4633 - lr: 0.000030
2023-10-19 19:42:34,670 DEV : loss 0.37004899978637695 - f1-score (micro avg) 0.0
2023-10-19 19:42:34,685 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:36,937 epoch 2 - iter 89/893 - loss 0.56991310 - time (sec): 2.25 - samples/sec: 10827.79 - lr: 0.000030 - momentum: 0.000000
2023-10-19 19:42:39,232 epoch 2 - iter 178/893 - loss 0.51829563 - time (sec): 4.55 - samples/sec: 10988.57 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:42:41,570 epoch 2 - iter 267/893 - loss 0.51918911 - time (sec): 6.88 - samples/sec: 10943.58 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:42:43,885 epoch 2 - iter 356/893 - loss 0.50708102 - time (sec): 9.20 - samples/sec: 10975.86 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:42:46,150 epoch 2 - iter 445/893 - loss 0.49813056 - time (sec): 11.47 - samples/sec: 10761.12 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:42:48,412 epoch 2 - iter 534/893 - loss 0.48552781 - time (sec): 13.73 - samples/sec: 10827.25 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:42:50,698 epoch 2 - iter 623/893 - loss 0.48075918 - time (sec): 16.01 - samples/sec: 10836.03 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:42:52,988 epoch 2 - iter 712/893 - loss 0.48102198 - time (sec): 18.30 - samples/sec: 10908.04 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:42:55,169 epoch 2 - iter 801/893 - loss 0.47517621 - time (sec): 20.48 - samples/sec: 10910.74 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:42:57,403 epoch 2 - iter 890/893 - loss 0.46873055 - time (sec): 22.72 - samples/sec: 10918.99 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:42:57,476 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:57,476 EPOCH 2 done: loss 0.4686 - lr: 0.000027
2023-10-19 19:42:59,798 DEV : loss 0.27531182765960693 - f1-score (micro avg) 0.271
2023-10-19 19:42:59,812 saving best model
2023-10-19 19:42:59,842 ----------------------------------------------------------------------------------------------------
2023-10-19 19:43:02,214 epoch 3 - iter 89/893 - loss 0.38596650 - time (sec): 2.37 - samples/sec: 9927.69 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:43:04,319 epoch 3 - iter 178/893 - loss 0.39477050 - time (sec): 4.48 - samples/sec: 10631.79 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:43:06,573 epoch 3 - iter 267/893 - loss 0.40926338 - time (sec): 6.73 - samples/sec: 10676.77 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:43:08,797 epoch 3 - iter 356/893 - loss 0.39975922 - time (sec): 8.95 - samples/sec: 10767.85 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:43:11,077 epoch 3 - iter 445/893 - loss 0.39903730 - time (sec): 11.23 - samples/sec: 10745.31 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:43:13,421 epoch 3 - iter 534/893 - loss 0.39669463 - time (sec): 13.58 - samples/sec: 10844.31 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:43:15,674 epoch 3 - iter 623/893 - loss 0.39311498 - time (sec): 15.83 - samples/sec: 10812.31 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:43:17,971 epoch 3 - iter 712/893 - loss 0.39258502 - time (sec): 18.13 - samples/sec: 10856.97 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:43:20,300 epoch 3 - iter 801/893 - loss 0.38428342 - time (sec): 20.46 - samples/sec: 10923.44 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:43:22,576 epoch 3 - iter 890/893 - loss 0.38254796 - time (sec): 22.73 - samples/sec: 10905.73 - lr: 0.000023 - momentum: 0.000000
2023-10-19 19:43:22,657 ----------------------------------------------------------------------------------------------------
2023-10-19 19:43:22,657 EPOCH 3 done: loss 0.3827 - lr: 0.000023
2023-10-19 19:43:25,008 DEV : loss 0.24437254667282104 - f1-score (micro avg) 0.3806
2023-10-19 19:43:25,023 saving best model
2023-10-19 19:43:25,061 ----------------------------------------------------------------------------------------------------
2023-10-19 19:43:27,203 epoch 4 - iter 89/893 - loss 0.38266866 - time (sec): 2.14 - samples/sec: 10963.05 - lr: 0.000023 - momentum: 0.000000
2023-10-19 19:43:29,424 epoch 4 - iter 178/893 - loss 0.36491384 - time (sec): 4.36 - samples/sec: 11065.18 - lr: 0.000023 - momentum: 0.000000
2023-10-19 19:43:31,745 epoch 4 - iter 267/893 - loss 0.36615042 - time (sec): 6.68 - samples/sec: 11134.67 - lr: 0.000022 - momentum: 0.000000
2023-10-19 19:43:34,104 epoch 4 - iter 356/893 - loss 0.36528656 - time (sec): 9.04 - samples/sec: 10703.86 - lr: 0.000022 - momentum: 0.000000
2023-10-19 19:43:36,429 epoch 4 - iter 445/893 - loss 0.35886848 - time (sec): 11.37 - samples/sec: 10649.20 - lr: 0.000022 - momentum: 0.000000
2023-10-19 19:43:38,724 epoch 4 - iter 534/893 - loss 0.35993741 - time (sec): 13.66 - samples/sec: 10697.14 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:43:41,003 epoch 4 - iter 623/893 - loss 0.35205696 - time (sec): 15.94 - samples/sec: 10784.30 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:43:43,250 epoch 4 - iter 712/893 - loss 0.35119897 - time (sec): 18.19 - samples/sec: 10840.13 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:43:45,525 epoch 4 - iter 801/893 - loss 0.34686215 - time (sec): 20.46 - samples/sec: 10881.75 - lr: 0.000020 - momentum: 0.000000
2023-10-19 19:43:47,797 epoch 4 - iter 890/893 - loss 0.34408543 - time (sec): 22.74 - samples/sec: 10913.42 - lr: 0.000020 - momentum: 0.000000
2023-10-19 19:43:47,875 ----------------------------------------------------------------------------------------------------
2023-10-19 19:43:47,875 EPOCH 4 done: loss 0.3444 - lr: 0.000020
2023-10-19 19:43:50,697 DEV : loss 0.22855891287326813 - f1-score (micro avg) 0.4203
2023-10-19 19:43:50,711 saving best model
2023-10-19 19:43:50,743 ----------------------------------------------------------------------------------------------------
2023-10-19 19:43:52,824 epoch 5 - iter 89/893 - loss 0.31466381 - time (sec): 2.08 - samples/sec: 12234.69 - lr: 0.000020 - momentum: 0.000000
2023-10-19 19:43:54,783 epoch 5 - iter 178/893 - loss 0.33259770 - time (sec): 4.04 - samples/sec: 12497.44 - lr: 0.000019 - momentum: 0.000000
2023-10-19 19:43:56,784 epoch 5 - iter 267/893 - loss 0.33327894 - time (sec): 6.04 - samples/sec: 12653.96 - lr: 0.000019 - momentum: 0.000000
2023-10-19 19:43:58,841 epoch 5 - iter 356/893 - loss 0.33596139 - time (sec): 8.10 - samples/sec: 12185.75 - lr: 0.000019 - momentum: 0.000000
2023-10-19 19:44:01,139 epoch 5 - iter 445/893 - loss 0.33259103 - time (sec): 10.39 - samples/sec: 11893.74 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:44:03,346 epoch 5 - iter 534/893 - loss 0.32763605 - time (sec): 12.60 - samples/sec: 11795.05 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:44:05,585 epoch 5 - iter 623/893 - loss 0.32488129 - time (sec): 14.84 - samples/sec: 11686.67 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:44:07,797 epoch 5 - iter 712/893 - loss 0.32110686 - time (sec): 17.05 - samples/sec: 11715.43 - lr: 0.000017 - momentum: 0.000000
2023-10-19 19:44:10,044 epoch 5 - iter 801/893 - loss 0.32002079 - time (sec): 19.30 - samples/sec: 11627.50 - lr: 0.000017 - momentum: 0.000000
2023-10-19 19:44:12,284 epoch 5 - iter 890/893 - loss 0.31595157 - time (sec): 21.54 - samples/sec: 11524.77 - lr: 0.000017 - momentum: 0.000000
2023-10-19 19:44:12,355 ----------------------------------------------------------------------------------------------------
2023-10-19 19:44:12,356 EPOCH 5 done: loss 0.3164 - lr: 0.000017
2023-10-19 19:44:14,717 DEV : loss 0.21935363113880157 - f1-score (micro avg) 0.4215
2023-10-19 19:44:14,733 saving best model
2023-10-19 19:44:14,768 ----------------------------------------------------------------------------------------------------
2023-10-19 19:44:17,085 epoch 6 - iter 89/893 - loss 0.29676919 - time (sec): 2.32 - samples/sec: 10650.34 - lr: 0.000016 - momentum: 0.000000
2023-10-19 19:44:19,366 epoch 6 - iter 178/893 - loss 0.28651342 - time (sec): 4.60 - samples/sec: 10997.91 - lr: 0.000016 - momentum: 0.000000
2023-10-19 19:44:21,619 epoch 6 - iter 267/893 - loss 0.29160014 - time (sec): 6.85 - samples/sec: 11104.38 - lr: 0.000016 - momentum: 0.000000
2023-10-19 19:44:24,327 epoch 6 - iter 356/893 - loss 0.29690307 - time (sec): 9.56 - samples/sec: 10517.72 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:44:26,566 epoch 6 - iter 445/893 - loss 0.29916859 - time (sec): 11.80 - samples/sec: 10723.48 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:44:28,793 epoch 6 - iter 534/893 - loss 0.29847625 - time (sec): 14.02 - samples/sec: 10754.82 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:44:31,010 epoch 6 - iter 623/893 - loss 0.29921952 - time (sec): 16.24 - samples/sec: 10726.73 - lr: 0.000014 - momentum: 0.000000
2023-10-19 19:44:33,255 epoch 6 - iter 712/893 - loss 0.29877886 - time (sec): 18.49 - samples/sec: 10749.20 - lr: 0.000014 - momentum: 0.000000
2023-10-19 19:44:35,494 epoch 6 - iter 801/893 - loss 0.29509483 - time (sec): 20.73 - samples/sec: 10769.45 - lr: 0.000014 - momentum: 0.000000
2023-10-19 19:44:37,767 epoch 6 - iter 890/893 - loss 0.29708723 - time (sec): 23.00 - samples/sec: 10793.38 - lr: 0.000013 - momentum: 0.000000
2023-10-19 19:44:37,841 ----------------------------------------------------------------------------------------------------
2023-10-19 19:44:37,841 EPOCH 6 done: loss 0.2976 - lr: 0.000013
2023-10-19 19:44:40,209 DEV : loss 0.21392673254013062 - f1-score (micro avg) 0.4466
2023-10-19 19:44:40,224 saving best model
2023-10-19 19:44:40,259 ----------------------------------------------------------------------------------------------------
2023-10-19 19:44:42,366 epoch 7 - iter 89/893 - loss 0.28381710 - time (sec): 2.11 - samples/sec: 10978.42 - lr: 0.000013 - momentum: 0.000000
2023-10-19 19:44:44,619 epoch 7 - iter 178/893 - loss 0.28769399 - time (sec): 4.36 - samples/sec: 11038.25 - lr: 0.000013 - momentum: 0.000000
2023-10-19 19:44:46,970 epoch 7 - iter 267/893 - loss 0.27591167 - time (sec): 6.71 - samples/sec: 10940.40 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:44:49,224 epoch 7 - iter 356/893 - loss 0.28249607 - time (sec): 8.96 - samples/sec: 10912.64 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:44:51,474 epoch 7 - iter 445/893 - loss 0.28466695 - time (sec): 11.21 - samples/sec: 10979.50 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:44:53,826 epoch 7 - iter 534/893 - loss 0.28087036 - time (sec): 13.57 - samples/sec: 10976.81 - lr: 0.000011 - momentum: 0.000000
2023-10-19 19:44:56,168 epoch 7 - iter 623/893 - loss 0.28248157 - time (sec): 15.91 - samples/sec: 10935.88 - lr: 0.000011 - momentum: 0.000000
2023-10-19 19:44:58,432 epoch 7 - iter 712/893 - loss 0.28429254 - time (sec): 18.17 - samples/sec: 10891.84 - lr: 0.000011 - momentum: 0.000000
2023-10-19 19:45:00,735 epoch 7 - iter 801/893 - loss 0.28458619 - time (sec): 20.47 - samples/sec: 10864.55 - lr: 0.000010 - momentum: 0.000000
2023-10-19 19:45:03,071 epoch 7 - iter 890/893 - loss 0.28601360 - time (sec): 22.81 - samples/sec: 10877.94 - lr: 0.000010 - momentum: 0.000000
2023-10-19 19:45:03,142 ----------------------------------------------------------------------------------------------------
2023-10-19 19:45:03,142 EPOCH 7 done: loss 0.2855 - lr: 0.000010
2023-10-19 19:45:06,007 DEV : loss 0.20663976669311523 - f1-score (micro avg) 0.4633
2023-10-19 19:45:06,021 saving best model
2023-10-19 19:45:06,054 ----------------------------------------------------------------------------------------------------
2023-10-19 19:45:08,190 epoch 8 - iter 89/893 - loss 0.29246080 - time (sec): 2.13 - samples/sec: 11859.54 - lr: 0.000010 - momentum: 0.000000
2023-10-19 19:45:10,454 epoch 8 - iter 178/893 - loss 0.29078851 - time (sec): 4.40 - samples/sec: 11412.92 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:45:12,810 epoch 8 - iter 267/893 - loss 0.28731471 - time (sec): 6.76 - samples/sec: 10883.87 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:45:15,164 epoch 8 - iter 356/893 - loss 0.27721438 - time (sec): 9.11 - samples/sec: 10866.31 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:45:17,447 epoch 8 - iter 445/893 - loss 0.27802466 - time (sec): 11.39 - samples/sec: 10804.34 - lr: 0.000008 - momentum: 0.000000
2023-10-19 19:45:19,738 epoch 8 - iter 534/893 - loss 0.27408626 - time (sec): 13.68 - samples/sec: 10978.00 - lr: 0.000008 - momentum: 0.000000
2023-10-19 19:45:21,951 epoch 8 - iter 623/893 - loss 0.27492437 - time (sec): 15.90 - samples/sec: 10899.44 - lr: 0.000008 - momentum: 0.000000
2023-10-19 19:45:24,239 epoch 8 - iter 712/893 - loss 0.27135379 - time (sec): 18.18 - samples/sec: 10916.13 - lr: 0.000007 - momentum: 0.000000
2023-10-19 19:45:26,479 epoch 8 - iter 801/893 - loss 0.27250084 - time (sec): 20.42 - samples/sec: 10985.45 - lr: 0.000007 - momentum: 0.000000
2023-10-19 19:45:28,773 epoch 8 - iter 890/893 - loss 0.27431573 - time (sec): 22.72 - samples/sec: 10904.35 - lr: 0.000007 - momentum: 0.000000
2023-10-19 19:45:28,850 ----------------------------------------------------------------------------------------------------
2023-10-19 19:45:28,850 EPOCH 8 done: loss 0.2736 - lr: 0.000007
2023-10-19 19:45:31,203 DEV : loss 0.2064720094203949 - f1-score (micro avg) 0.4789
2023-10-19 19:45:31,217 saving best model
2023-10-19 19:45:31,253 ----------------------------------------------------------------------------------------------------
2023-10-19 19:45:33,960 epoch 9 - iter 89/893 - loss 0.26816572 - time (sec): 2.71 - samples/sec: 8997.54 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:45:36,213 epoch 9 - iter 178/893 - loss 0.28006657 - time (sec): 4.96 - samples/sec: 9839.05 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:45:38,341 epoch 9 - iter 267/893 - loss 0.26771406 - time (sec): 7.09 - samples/sec: 10273.57 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:45:40,606 epoch 9 - iter 356/893 - loss 0.27004687 - time (sec): 9.35 - samples/sec: 10530.02 - lr: 0.000005 - momentum: 0.000000
2023-10-19 19:45:42,932 epoch 9 - iter 445/893 - loss 0.26765570 - time (sec): 11.68 - samples/sec: 10486.10 - lr: 0.000005 - momentum: 0.000000
2023-10-19 19:45:45,279 epoch 9 - iter 534/893 - loss 0.26724294 - time (sec): 14.03 - samples/sec: 10598.37 - lr: 0.000005 - momentum: 0.000000
2023-10-19 19:45:47,526 epoch 9 - iter 623/893 - loss 0.26654924 - time (sec): 16.27 - samples/sec: 10579.28 - lr: 0.000004 - momentum: 0.000000
2023-10-19 19:45:49,904 epoch 9 - iter 712/893 - loss 0.26579743 - time (sec): 18.65 - samples/sec: 10618.05 - lr: 0.000004 - momentum: 0.000000
2023-10-19 19:45:52,282 epoch 9 - iter 801/893 - loss 0.26907596 - time (sec): 21.03 - samples/sec: 10590.61 - lr: 0.000004 - momentum: 0.000000
2023-10-19 19:45:54,610 epoch 9 - iter 890/893 - loss 0.26503635 - time (sec): 23.36 - samples/sec: 10613.49 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:45:54,686 ----------------------------------------------------------------------------------------------------
2023-10-19 19:45:54,687 EPOCH 9 done: loss 0.2649 - lr: 0.000003
2023-10-19 19:45:57,052 DEV : loss 0.20236296951770782 - f1-score (micro avg) 0.4815
2023-10-19 19:45:57,067 saving best model
2023-10-19 19:45:57,103 ----------------------------------------------------------------------------------------------------
2023-10-19 19:45:59,463 epoch 10 - iter 89/893 - loss 0.27750107 - time (sec): 2.36 - samples/sec: 10578.50 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:46:01,785 epoch 10 - iter 178/893 - loss 0.27064008 - time (sec): 4.68 - samples/sec: 10750.61 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:46:04,150 epoch 10 - iter 267/893 - loss 0.27340003 - time (sec): 7.05 - samples/sec: 10784.09 - lr: 0.000002 - momentum: 0.000000
2023-10-19 19:46:06,451 epoch 10 - iter 356/893 - loss 0.26540687 - time (sec): 9.35 - samples/sec: 10629.43 - lr: 0.000002 - momentum: 0.000000
2023-10-19 19:46:08,741 epoch 10 - iter 445/893 - loss 0.26753954 - time (sec): 11.64 - samples/sec: 10585.41 - lr: 0.000002 - momentum: 0.000000
2023-10-19 19:46:11,090 epoch 10 - iter 534/893 - loss 0.26905903 - time (sec): 13.99 - samples/sec: 10577.08 - lr: 0.000001 - momentum: 0.000000
2023-10-19 19:46:13,346 epoch 10 - iter 623/893 - loss 0.26904401 - time (sec): 16.24 - samples/sec: 10644.31 - lr: 0.000001 - momentum: 0.000000
2023-10-19 19:46:15,712 epoch 10 - iter 712/893 - loss 0.26416451 - time (sec): 18.61 - samples/sec: 10650.66 - lr: 0.000001 - momentum: 0.000000
2023-10-19 19:46:17,960 epoch 10 - iter 801/893 - loss 0.26399106 - time (sec): 20.86 - samples/sec: 10656.47 - lr: 0.000000 - momentum: 0.000000
2023-10-19 19:46:20,248 epoch 10 - iter 890/893 - loss 0.26342787 - time (sec): 23.15 - samples/sec: 10724.28 - lr: 0.000000 - momentum: 0.000000
2023-10-19 19:46:20,316 ----------------------------------------------------------------------------------------------------
2023-10-19 19:46:20,316 EPOCH 10 done: loss 0.2634 - lr: 0.000000
2023-10-19 19:46:23,177 DEV : loss 0.20162628591060638 - f1-score (micro avg) 0.4804
2023-10-19 19:46:23,221 ----------------------------------------------------------------------------------------------------
2023-10-19 19:46:23,222 Loading model from best epoch ...
2023-10-19 19:46:23,309 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
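The 17-tag dictionary above is the `O` tag plus the four BIOES prefixes (Single, Begin, End, Inside) applied to each of the four entity types. A short sketch reproduces the tag set in the order the log prints it:

```python
# Rebuild the 17-tag BIOES dictionary: "O" plus S-/B-/E-/I- variants
# of each entity type, in the order shown in the log line above.

entity_types = ["PER", "LOC", "ORG", "HumanProd"]

tags = ["O"] + [f"{prefix}-{etype}" for etype in entity_types for prefix in "SBEI"]
print(len(tags), tags)
```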
2023-10-19 19:46:27,910
Results:
- F-score (micro) 0.3717
- F-score (macro) 0.2097
- Accuracy 0.2385
By class:
              precision    recall  f1-score   support

         LOC     0.3806    0.4776    0.4237      1095
         PER     0.3604    0.4348    0.3941      1012
         ORG     0.0431    0.0140    0.0211       357
   HumanProd     0.0000    0.0000    0.0000        33

   micro avg     0.3571    0.3877    0.3717      2497
   macro avg     0.1960    0.2316    0.2097      2497
weighted avg     0.3191    0.3877    0.3485      2497
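The averages in the table can be cross-checked from the per-class rows. The sketch below recomputes the macro F1 as the unweighted mean of the per-class F1 scores, and the micro F1 as the harmonic mean of the pooled precision and recall reported in the log; since it works from the four-decimal printed values rather than the raw counts, only approximate agreement with the table is expected.

```python
# Cross-check the macro and micro averages from the per-class rows above.
# Inputs are the four-decimal figures from the log, so agreement is approximate.

per_class = {                      # precision, recall, f1-score, support
    "LOC":       (0.3806, 0.4776, 0.4237, 1095),
    "PER":       (0.3604, 0.4348, 0.3941, 1012),
    "ORG":       (0.0431, 0.0140, 0.0211, 357),
    "HumanProd": (0.0000, 0.0000, 0.0000, 33),
}

def f1(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0

# Macro F1: unweighted mean of per-class F1 scores.
macro_f1 = sum(v[2] for v in per_class.values()) / len(per_class)

# Micro F1: harmonic mean of the pooled precision/recall from the micro avg row.
micro_f1 = f1(0.3571, 0.3877)

print(f"macro F1 ~ {macro_f1:.4f}, micro F1 ~ {micro_f1:.4f}")
```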
2023-10-19 19:46:27,910 ----------------------------------------------------------------------------------------------------