2023-10-19 19:42:10,225 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,226 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 19:42:10,226 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,226 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-19 19:42:10,226 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,226 Train: 7142 sentences
2023-10-19 19:42:10,226 (train_with_dev=False, train_with_test=False)
2023-10-19 19:42:10,226 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,226 Training Params:
2023-10-19 19:42:10,226 - learning_rate: "3e-05"
2023-10-19 19:42:10,226 - mini_batch_size: "8"
2023-10-19 19:42:10,226 - max_epochs: "10"
2023-10-19 19:42:10,226 - shuffle: "True"
2023-10-19 19:42:10,226 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,226 Plugins:
2023-10-19 19:42:10,226 - TensorboardLogger
2023-10-19 19:42:10,226 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 19:42:10,227 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,227 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 19:42:10,227 - metric: "('micro avg', 'f1-score')"
2023-10-19 19:42:10,227 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,227 Computation:
2023-10-19 19:42:10,227 - compute on device: cuda:0
2023-10-19 19:42:10,227 - embedding storage: none
2023-10-19 19:42:10,227 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,227 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-19 19:42:10,227 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,227 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:10,227 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 19:42:12,499 epoch 1 - iter 89/893 - loss 3.42294852 - time (sec): 2.27 - samples/sec: 11786.11 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:42:14,750 epoch 1 - iter 178/893 - loss 3.24973616 - time (sec): 4.52 - samples/sec: 11380.29 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:42:17,036 epoch 1 - iter 267/893 - loss 2.92472187 - time (sec): 6.81 - samples/sec: 11321.04 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:42:19,445 epoch 1 - iter 356/893 - loss 2.58407417 - time (sec): 9.22 - samples/sec: 10852.87 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:42:21,815 epoch 1 - iter 445/893 - loss 2.24270513 - time (sec): 11.59 - samples/sec: 10780.32 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:42:24,038 epoch 1 - iter 534/893 - loss 1.99303527 - time (sec): 13.81 - samples/sec: 10815.88 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:42:26,284 epoch 1 - iter 623/893 - loss 1.81063405 - time (sec): 16.06 - samples/sec: 10852.24 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:42:28,688 epoch 1 - iter 712/893 - loss 1.66666224 - time (sec): 18.46 - samples/sec: 10870.25 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:42:30,952 epoch 1 - iter 801/893 - loss 1.55222766 - time (sec): 20.72 - samples/sec: 10924.19 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:42:33,153 epoch 1 - iter 890/893 - loss 1.46483452 - time (sec): 22.93 - samples/sec: 10827.74 - lr: 0.000030 - momentum: 0.000000
2023-10-19 19:42:33,225 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:33,225 EPOCH 1 done: loss 1.4633 - lr: 0.000030
2023-10-19 19:42:34,670 DEV : loss 0.37004899978637695 - f1-score (micro avg) 0.0
2023-10-19 19:42:34,685 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:36,937 epoch 2 - iter 89/893 - loss 0.56991310 - time (sec): 2.25 - samples/sec: 10827.79 - lr: 0.000030 - momentum: 0.000000
2023-10-19 19:42:39,232 epoch 2 - iter 178/893 - loss 0.51829563 - time (sec): 4.55 - samples/sec: 10988.57 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:42:41,570 epoch 2 - iter 267/893 - loss 0.51918911 - time (sec): 6.88 - samples/sec: 10943.58 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:42:43,885 epoch 2 - iter 356/893 - loss 0.50708102 - time (sec): 9.20 - samples/sec: 10975.86 - lr: 0.000029 - momentum: 0.000000
2023-10-19 19:42:46,150 epoch 2 - iter 445/893 - loss 0.49813056 - time (sec): 11.47 - samples/sec: 10761.12 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:42:48,412 epoch 2 - iter 534/893 - loss 0.48552781 - time (sec): 13.73 - samples/sec: 10827.25 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:42:50,698 epoch 2 - iter 623/893 - loss 0.48075918 - time (sec): 16.01 - samples/sec: 10836.03 - lr: 0.000028 - momentum: 0.000000
2023-10-19 19:42:52,988 epoch 2 - iter 712/893 - loss 0.48102198 - time (sec): 18.30 - samples/sec: 10908.04 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:42:55,169 epoch 2 - iter 801/893 - loss 0.47517621 - time (sec): 20.48 - samples/sec: 10910.74 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:42:57,403 epoch 2 - iter 890/893 - loss 0.46873055 - time (sec): 22.72 - samples/sec: 10918.99 - lr: 0.000027 - momentum: 0.000000
2023-10-19 19:42:57,476 ----------------------------------------------------------------------------------------------------
2023-10-19 19:42:57,476 EPOCH 2 done: loss 0.4686 - lr: 0.000027
2023-10-19 19:42:59,798 DEV : loss 0.27531182765960693 - f1-score (micro avg) 0.271
2023-10-19 19:42:59,812 saving best model
2023-10-19 19:42:59,842 ----------------------------------------------------------------------------------------------------
2023-10-19 19:43:02,214 epoch 3 - iter 89/893 - loss 0.38596650 - time (sec): 2.37 - samples/sec: 9927.69 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:43:04,319 epoch 3 - iter 178/893 - loss 0.39477050 - time (sec): 4.48 - samples/sec: 10631.79 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:43:06,573 epoch 3 - iter 267/893 - loss 0.40926338 - time (sec): 6.73 - samples/sec: 10676.77 - lr: 0.000026 - momentum: 0.000000
2023-10-19 19:43:08,797 epoch 3 - iter 356/893 - loss 0.39975922 - time (sec): 8.95 - samples/sec: 10767.85 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:43:11,077 epoch 3 - iter 445/893 - loss 0.39903730 - time (sec): 11.23 - samples/sec: 10745.31 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:43:13,421 epoch 3 - iter 534/893 - loss 0.39669463 - time (sec): 13.58 - samples/sec: 10844.31 - lr: 0.000025 - momentum: 0.000000
2023-10-19 19:43:15,674 epoch 3 - iter 623/893 - loss 0.39311498 - time (sec): 15.83 - samples/sec: 10812.31 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:43:17,971 epoch 3 - iter 712/893 - loss 0.39258502 - time (sec): 18.13 - samples/sec: 10856.97 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:43:20,300 epoch 3 - iter 801/893 - loss 0.38428342 - time (sec): 20.46 - samples/sec: 10923.44 - lr: 0.000024 - momentum: 0.000000
2023-10-19 19:43:22,576 epoch 3 - iter 890/893 - loss 0.38254796 - time (sec): 22.73 - samples/sec: 10905.73 - lr: 0.000023 - momentum: 0.000000
2023-10-19 19:43:22,657 ----------------------------------------------------------------------------------------------------
2023-10-19 19:43:22,657 EPOCH 3 done: loss 0.3827 - lr: 0.000023
2023-10-19 19:43:25,008 DEV : loss 0.24437254667282104 - f1-score (micro avg) 0.3806
2023-10-19 19:43:25,023 saving best model
2023-10-19 19:43:25,061 ----------------------------------------------------------------------------------------------------
2023-10-19 19:43:27,203 epoch 4 - iter 89/893 - loss 0.38266866 - time (sec): 2.14 - samples/sec: 10963.05 - lr: 0.000023 - momentum: 0.000000
2023-10-19 19:43:29,424 epoch 4 - iter 178/893 - loss 0.36491384 - time (sec): 4.36 - samples/sec: 11065.18 - lr: 0.000023 - momentum: 0.000000
2023-10-19 19:43:31,745 epoch 4 - iter 267/893 - loss 0.36615042 - time (sec): 6.68 - samples/sec: 11134.67 - lr: 0.000022 - momentum: 0.000000
2023-10-19 19:43:34,104 epoch 4 - iter 356/893 - loss 0.36528656 - time (sec): 9.04 - samples/sec: 10703.86 - lr: 0.000022 - momentum: 0.000000
2023-10-19 19:43:36,429 epoch 4 - iter 445/893 - loss 0.35886848 - time (sec): 11.37 - samples/sec: 10649.20 - lr: 0.000022 - momentum: 0.000000
2023-10-19 19:43:38,724 epoch 4 - iter 534/893 - loss 0.35993741 - time (sec): 13.66 - samples/sec: 10697.14 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:43:41,003 epoch 4 - iter 623/893 - loss 0.35205696 - time (sec): 15.94 - samples/sec: 10784.30 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:43:43,250 epoch 4 - iter 712/893 - loss 0.35119897 - time (sec): 18.19 - samples/sec: 10840.13 - lr: 0.000021 - momentum: 0.000000
2023-10-19 19:43:45,525 epoch 4 - iter 801/893 - loss 0.34686215 - time (sec): 20.46 - samples/sec: 10881.75 - lr: 0.000020 - momentum: 0.000000
2023-10-19 19:43:47,797 epoch 4 - iter 890/893 - loss 0.34408543 - time (sec): 22.74 - samples/sec: 10913.42 - lr: 0.000020 - momentum: 0.000000
2023-10-19 19:43:47,875 ----------------------------------------------------------------------------------------------------
2023-10-19 19:43:47,875 EPOCH 4 done: loss 0.3444 - lr: 0.000020
2023-10-19 19:43:50,697 DEV : loss 0.22855891287326813 - f1-score (micro avg) 0.4203
2023-10-19 19:43:50,711 saving best model
2023-10-19 19:43:50,743 ----------------------------------------------------------------------------------------------------
2023-10-19 19:43:52,824 epoch 5 - iter 89/893 - loss 0.31466381 - time (sec): 2.08 - samples/sec: 12234.69 - lr: 0.000020 - momentum: 0.000000
2023-10-19 19:43:54,783 epoch 5 - iter 178/893 - loss 0.33259770 - time (sec): 4.04 - samples/sec: 12497.44 - lr: 0.000019 - momentum: 0.000000
2023-10-19 19:43:56,784 epoch 5 - iter 267/893 - loss 0.33327894 - time (sec): 6.04 - samples/sec: 12653.96 - lr: 0.000019 - momentum: 0.000000
2023-10-19 19:43:58,841 epoch 5 - iter 356/893 - loss 0.33596139 - time (sec): 8.10 - samples/sec: 12185.75 - lr: 0.000019 - momentum: 0.000000
2023-10-19 19:44:01,139 epoch 5 - iter 445/893 - loss 0.33259103 - time (sec): 10.39 - samples/sec: 11893.74 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:44:03,346 epoch 5 - iter 534/893 - loss 0.32763605 - time (sec): 12.60 - samples/sec: 11795.05 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:44:05,585 epoch 5 - iter 623/893 - loss 0.32488129 - time (sec): 14.84 - samples/sec: 11686.67 - lr: 0.000018 - momentum: 0.000000
2023-10-19 19:44:07,797 epoch 5 - iter 712/893 - loss 0.32110686 - time (sec): 17.05 - samples/sec: 11715.43 - lr: 0.000017 - momentum: 0.000000
2023-10-19 19:44:10,044 epoch 5 - iter 801/893 - loss 0.32002079 - time (sec): 19.30 - samples/sec: 11627.50 - lr: 0.000017 - momentum: 0.000000
2023-10-19 19:44:12,284 epoch 5 - iter 890/893 - loss 0.31595157 - time (sec): 21.54 - samples/sec: 11524.77 - lr: 0.000017 - momentum: 0.000000
2023-10-19 19:44:12,355 ----------------------------------------------------------------------------------------------------
2023-10-19 19:44:12,356 EPOCH 5 done: loss 0.3164 - lr: 0.000017
2023-10-19 19:44:14,717 DEV : loss 0.21935363113880157 - f1-score (micro avg) 0.4215
2023-10-19 19:44:14,733 saving best model
2023-10-19 19:44:14,768 ----------------------------------------------------------------------------------------------------
2023-10-19 19:44:17,085 epoch 6 - iter 89/893 - loss 0.29676919 - time (sec): 2.32 - samples/sec: 10650.34 - lr: 0.000016 - momentum: 0.000000
2023-10-19 19:44:19,366 epoch 6 - iter 178/893 - loss 0.28651342 - time (sec): 4.60 - samples/sec: 10997.91 - lr: 0.000016 - momentum: 0.000000
2023-10-19 19:44:21,619 epoch 6 - iter 267/893 - loss 0.29160014 - time (sec): 6.85 - samples/sec: 11104.38 - lr: 0.000016 - momentum: 0.000000
2023-10-19 19:44:24,327 epoch 6 - iter 356/893 - loss 0.29690307 - time (sec): 9.56 - samples/sec: 10517.72 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:44:26,566 epoch 6 - iter 445/893 - loss 0.29916859 - time (sec): 11.80 - samples/sec: 10723.48 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:44:28,793 epoch 6 - iter 534/893 - loss 0.29847625 - time (sec): 14.02 - samples/sec: 10754.82 - lr: 0.000015 - momentum: 0.000000
2023-10-19 19:44:31,010 epoch 6 - iter 623/893 - loss 0.29921952 - time (sec): 16.24 - samples/sec: 10726.73 - lr: 0.000014 - momentum: 0.000000
2023-10-19 19:44:33,255 epoch 6 - iter 712/893 - loss 0.29877886 - time (sec): 18.49 - samples/sec: 10749.20 - lr: 0.000014 - momentum: 0.000000
2023-10-19 19:44:35,494 epoch 6 - iter 801/893 - loss 0.29509483 - time (sec): 20.73 - samples/sec: 10769.45 - lr: 0.000014 - momentum: 0.000000
2023-10-19 19:44:37,767 epoch 6 - iter 890/893 - loss 0.29708723 - time (sec): 23.00 - samples/sec: 10793.38 - lr: 0.000013 - momentum: 0.000000
2023-10-19 19:44:37,841 ----------------------------------------------------------------------------------------------------
2023-10-19 19:44:37,841 EPOCH 6 done: loss 0.2976 - lr: 0.000013
2023-10-19 19:44:40,209 DEV : loss 0.21392673254013062 - f1-score (micro avg) 0.4466
2023-10-19 19:44:40,224 saving best model
2023-10-19 19:44:40,259 ----------------------------------------------------------------------------------------------------
2023-10-19 19:44:42,366 epoch 7 - iter 89/893 - loss 0.28381710 - time (sec): 2.11 - samples/sec: 10978.42 - lr: 0.000013 - momentum: 0.000000
2023-10-19 19:44:44,619 epoch 7 - iter 178/893 - loss 0.28769399 - time (sec): 4.36 - samples/sec: 11038.25 - lr: 0.000013 - momentum: 0.000000
2023-10-19 19:44:46,970 epoch 7 - iter 267/893 - loss 0.27591167 - time (sec): 6.71 - samples/sec: 10940.40 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:44:49,224 epoch 7 - iter 356/893 - loss 0.28249607 - time (sec): 8.96 - samples/sec: 10912.64 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:44:51,474 epoch 7 - iter 445/893 - loss 0.28466695 - time (sec): 11.21 - samples/sec: 10979.50 - lr: 0.000012 - momentum: 0.000000
2023-10-19 19:44:53,826 epoch 7 - iter 534/893 - loss 0.28087036 - time (sec): 13.57 - samples/sec: 10976.81 - lr: 0.000011 - momentum: 0.000000
2023-10-19 19:44:56,168 epoch 7 - iter 623/893 - loss 0.28248157 - time (sec): 15.91 - samples/sec: 10935.88 - lr: 0.000011 - momentum: 0.000000
2023-10-19 19:44:58,432 epoch 7 - iter 712/893 - loss 0.28429254 - time (sec): 18.17 - samples/sec: 10891.84 - lr: 0.000011 - momentum: 0.000000
2023-10-19 19:45:00,735 epoch 7 - iter 801/893 - loss 0.28458619 - time (sec): 20.47 - samples/sec: 10864.55 - lr: 0.000010 - momentum: 0.000000
2023-10-19 19:45:03,071 epoch 7 - iter 890/893 - loss 0.28601360 - time (sec): 22.81 - samples/sec: 10877.94 - lr: 0.000010 - momentum: 0.000000
2023-10-19 19:45:03,142 ----------------------------------------------------------------------------------------------------
2023-10-19 19:45:03,142 EPOCH 7 done: loss 0.2855 - lr: 0.000010
2023-10-19 19:45:06,007 DEV : loss 0.20663976669311523 - f1-score (micro avg) 0.4633
2023-10-19 19:45:06,021 saving best model
2023-10-19 19:45:06,054 ----------------------------------------------------------------------------------------------------
2023-10-19 19:45:08,190 epoch 8 - iter 89/893 - loss 0.29246080 - time (sec): 2.13 - samples/sec: 11859.54 - lr: 0.000010 - momentum: 0.000000
2023-10-19 19:45:10,454 epoch 8 - iter 178/893 - loss 0.29078851 - time (sec): 4.40 - samples/sec: 11412.92 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:45:12,810 epoch 8 - iter 267/893 - loss 0.28731471 - time (sec): 6.76 - samples/sec: 10883.87 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:45:15,164 epoch 8 - iter 356/893 - loss 0.27721438 - time (sec): 9.11 - samples/sec: 10866.31 - lr: 0.000009 - momentum: 0.000000
2023-10-19 19:45:17,447 epoch 8 - iter 445/893 - loss 0.27802466 - time (sec): 11.39 - samples/sec: 10804.34 - lr: 0.000008 - momentum: 0.000000
2023-10-19 19:45:19,738 epoch 8 - iter 534/893 - loss 0.27408626 - time (sec): 13.68 - samples/sec: 10978.00 - lr: 0.000008 - momentum: 0.000000
2023-10-19 19:45:21,951 epoch 8 - iter 623/893 - loss 0.27492437 - time (sec): 15.90 - samples/sec: 10899.44 - lr: 0.000008 - momentum: 0.000000
2023-10-19 19:45:24,239 epoch 8 - iter 712/893 - loss 0.27135379 - time (sec): 18.18 - samples/sec: 10916.13 - lr: 0.000007 - momentum: 0.000000
2023-10-19 19:45:26,479 epoch 8 - iter 801/893 - loss 0.27250084 - time (sec): 20.42 - samples/sec: 10985.45 - lr: 0.000007 - momentum: 0.000000
2023-10-19 19:45:28,773 epoch 8 - iter 890/893 - loss 0.27431573 - time (sec): 22.72 - samples/sec: 10904.35 - lr: 0.000007 - momentum: 0.000000
2023-10-19 19:45:28,850 ----------------------------------------------------------------------------------------------------
2023-10-19 19:45:28,850 EPOCH 8 done: loss 0.2736 - lr: 0.000007
2023-10-19 19:45:31,203 DEV : loss 0.2064720094203949 - f1-score (micro avg) 0.4789
2023-10-19 19:45:31,217 saving best model
2023-10-19 19:45:31,253 ----------------------------------------------------------------------------------------------------
2023-10-19 19:45:33,960 epoch 9 - iter 89/893 - loss 0.26816572 - time (sec): 2.71 - samples/sec: 8997.54 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:45:36,213 epoch 9 - iter 178/893 - loss 0.28006657 - time (sec): 4.96 - samples/sec: 9839.05 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:45:38,341 epoch 9 - iter 267/893 - loss 0.26771406 - time (sec): 7.09 - samples/sec: 10273.57 - lr: 0.000006 - momentum: 0.000000
2023-10-19 19:45:40,606 epoch 9 - iter 356/893 - loss 0.27004687 - time (sec): 9.35 - samples/sec: 10530.02 - lr: 0.000005 - momentum: 0.000000
2023-10-19 19:45:42,932 epoch 9 - iter 445/893 - loss 0.26765570 - time (sec): 11.68 - samples/sec: 10486.10 - lr: 0.000005 - momentum: 0.000000
2023-10-19 19:45:45,279 epoch 9 - iter 534/893 - loss 0.26724294 - time (sec): 14.03 - samples/sec: 10598.37 - lr: 0.000005 - momentum: 0.000000
2023-10-19 19:45:47,526 epoch 9 - iter 623/893 - loss 0.26654924 - time (sec): 16.27 - samples/sec: 10579.28 - lr: 0.000004 - momentum: 0.000000
2023-10-19 19:45:49,904 epoch 9 - iter 712/893 - loss 0.26579743 - time (sec): 18.65 - samples/sec: 10618.05 - lr: 0.000004 - momentum: 0.000000
2023-10-19 19:45:52,282 epoch 9 - iter 801/893 - loss 0.26907596 - time (sec): 21.03 - samples/sec: 10590.61 - lr: 0.000004 - momentum: 0.000000
2023-10-19 19:45:54,610 epoch 9 - iter 890/893 - loss 0.26503635 - time (sec): 23.36 - samples/sec: 10613.49 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:45:54,686 ----------------------------------------------------------------------------------------------------
2023-10-19 19:45:54,687 EPOCH 9 done: loss 0.2649 - lr: 0.000003
2023-10-19 19:45:57,052 DEV : loss 0.20236296951770782 - f1-score (micro avg) 0.4815
2023-10-19 19:45:57,067 saving best model
2023-10-19 19:45:57,103 ----------------------------------------------------------------------------------------------------
2023-10-19 19:45:59,463 epoch 10 - iter 89/893 - loss 0.27750107 - time (sec): 2.36 - samples/sec: 10578.50 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:46:01,785 epoch 10 - iter 178/893 - loss 0.27064008 - time (sec): 4.68 - samples/sec: 10750.61 - lr: 0.000003 - momentum: 0.000000
2023-10-19 19:46:04,150 epoch 10 - iter 267/893 - loss 0.27340003 - time (sec): 7.05 - samples/sec: 10784.09 - lr: 0.000002 - momentum: 0.000000
2023-10-19 19:46:06,451 epoch 10 - iter 356/893 - loss 0.26540687 - time (sec): 9.35 - samples/sec: 10629.43 - lr: 0.000002 - momentum: 0.000000
2023-10-19 19:46:08,741 epoch 10 - iter 445/893 - loss 0.26753954 - time (sec): 11.64 - samples/sec: 10585.41 - lr: 0.000002 - momentum: 0.000000
2023-10-19 19:46:11,090 epoch 10 - iter 534/893 - loss 0.26905903 - time (sec): 13.99 - samples/sec: 10577.08 - lr: 0.000001 - momentum: 0.000000
2023-10-19 19:46:13,346 epoch 10 - iter 623/893 - loss 0.26904401 - time (sec): 16.24 - samples/sec: 10644.31 - lr: 0.000001 - momentum: 0.000000
2023-10-19 19:46:15,712 epoch 10 - iter 712/893 - loss 0.26416451 - time (sec): 18.61 - samples/sec: 10650.66 - lr: 0.000001 - momentum: 0.000000
2023-10-19 19:46:17,960 epoch 10 - iter 801/893 - loss 0.26399106 - time (sec): 20.86 - samples/sec: 10656.47 - lr: 0.000000 - momentum: 0.000000
2023-10-19 19:46:20,248 epoch 10 - iter 890/893 - loss 0.26342787 - time (sec): 23.15 - samples/sec: 10724.28 - lr: 0.000000 - momentum: 0.000000
2023-10-19 19:46:20,316 ----------------------------------------------------------------------------------------------------
2023-10-19 19:46:20,316 EPOCH 10 done: loss 0.2634 - lr: 0.000000
2023-10-19 19:46:23,177 DEV : loss 0.20162628591060638 - f1-score (micro avg) 0.4804
2023-10-19 19:46:23,221 ----------------------------------------------------------------------------------------------------
2023-10-19 19:46:23,222 Loading model from best epoch ...
2023-10-19 19:46:23,309 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 19:46:27,910 Results:
- F-score (micro) 0.3717
- F-score (macro) 0.2097
- Accuracy 0.2385

By class:
              precision    recall  f1-score   support

         LOC     0.3806    0.4776    0.4237      1095
         PER     0.3604    0.4348    0.3941      1012
         ORG     0.0431    0.0140    0.0211       357
   HumanProd     0.0000    0.0000    0.0000        33

   micro avg     0.3571    0.3877    0.3717      2497
   macro avg     0.1960    0.2316    0.2097      2497
weighted avg     0.3191    0.3877    0.3485      2497

2023-10-19 19:46:27,910 ----------------------------------------------------------------------------------------------------
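The lr values in the iteration lines above follow the LinearScheduler plugin with warmup_fraction '0.1': the rate ramps linearly from 0 up to the peak learning_rate 3e-05 over the first 10% of training, then decays linearly back to 0. A minimal sketch of that schedule in plain Python (not Flair's own implementation; the total step count of 893 mini-batches × 10 epochs is inferred from the log):

```python
def linear_schedule_with_warmup(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0.

    A sketch of the schedule implied by the logged lr values,
    not Flair's exact implementation.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: 0 -> peak_lr over the first warmup_fraction of steps.
        return peak_lr * step / warmup_steps
    # Decay phase: peak_lr at warmup_steps down to 0 at total_steps.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


total = 893 * 10  # 893 mini-batches per epoch, 10 epochs
peak = 3e-05      # the configured learning_rate

print(linear_schedule_with_warmup(89, total, peak))    # warmup: close to the logged 0.000003
print(linear_schedule_with_warmup(893, total, peak))   # end of epoch 1: the peak 3e-05
print(linear_schedule_with_warmup(total, total, peak)) # end of epoch 10: decayed to 0
```

With warmup_fraction 0.1 and 8,930 total steps, the warmup ends exactly at the end of epoch 1, which matches the log: lr climbs to 0.000030 by iter 890 of epoch 1 and then falls steadily to 0.000000 by the end of epoch 10.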
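The averages in the final table can be sanity-checked from the per-class rows: the macro average is the unweighted mean of per-class F1 scores, while the micro average pools true positives, predicted spans, and gold spans across classes. The sketch below recomputes both from the (rounded) precision/recall/support columns, so the results match the log only to about four decimals; the helper and variable names are illustrative, not Flair's:

```python
# Per-class (precision, recall, support) from the final evaluation table.
classes = {
    "LOC":       (0.3806, 0.4776, 1095),
    "PER":       (0.3604, 0.4348, 1012),
    "ORG":       (0.0431, 0.0140, 357),
    "HumanProd": (0.0000, 0.0000, 33),
}

def f1(p, r):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1(p, r) for p, r, _ in classes.values()) / len(classes)

# Micro F1: pool counts across classes, then compute precision/recall once.
tp = sum(r * s for _, r, s in classes.values())             # true positives
pred = sum(r * s / p for p, r, s in classes.values() if p)  # predicted spans
gold = sum(s for _, _, s in classes.values())               # gold spans
micro_p, micro_r = tp / pred, tp / gold
micro_f1 = f1(micro_p, micro_r)

print(round(macro_f1, 4), round(micro_f1, 4))  # log reports 0.2097 and 0.3717
```

Note how the macro average (0.2097) is dragged down by the near-zero ORG and HumanProd classes, while the support-driven micro average (0.3717) is dominated by LOC and PER.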
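For downstream analysis of a log like this one (plotting the loss curve, comparing lr against the schedule), the iteration lines can be parsed with a short regex. A sketch, with hypothetical helper names (`ITER_RE`, `parse_iter_line` are not part of Flair):

```python
import re

# Matches Flair iteration lines of the form
# "... epoch 1 - iter 89/893 - loss 3.42294852 - ... - lr: 0.000003 - ..."
ITER_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/\d+ - loss (?P<loss>[\d.]+)"
    r".*?lr: (?P<lr>[\d.]+)"
)

def parse_iter_line(line):
    """Return (epoch, iter, loss, lr) from a Flair iteration log line, or None."""
    m = ITER_RE.search(line)
    if not m:
        return None
    return int(m["epoch"]), int(m["iter"]), float(m["loss"]), float(m["lr"])

line = ("2023-10-19 19:42:12,499 epoch 1 - iter 89/893 - loss 3.42294852 "
        "- time (sec): 2.27 - samples/sec: 11786.11 - lr: 0.000003 - momentum: 0.000000")
print(parse_iter_line(line))  # (1, 89, 3.42294852, 3e-06)
```

Separator lines, epoch summaries, and DEV lines simply return `None`, so the parser can be run over every line of the log unchanged.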