2023-10-13 11:07:50,255 ----------------------------------------------------------------------------------------------------
2023-10-13 11:07:50,256 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 11:07:50,256 ----------------------------------------------------------------------------------------------------
2023-10-13 11:07:50,256 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-13 11:07:50,257 ----------------------------------------------------------------------------------------------------
2023-10-13 11:07:50,257 Train: 966 sentences
2023-10-13 11:07:50,257 (train_with_dev=False, train_with_test=False)
2023-10-13 11:07:50,257 ----------------------------------------------------------------------------------------------------
2023-10-13 11:07:50,257 Training Params:
2023-10-13 11:07:50,257 - learning_rate: "5e-05"
2023-10-13 11:07:50,257 - mini_batch_size: "4"
2023-10-13 11:07:50,257 - max_epochs: "10"
2023-10-13 11:07:50,257 - shuffle: "True"
2023-10-13 11:07:50,257 ----------------------------------------------------------------------------------------------------
2023-10-13 11:07:50,257 Plugins:
2023-10-13 11:07:50,257 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 11:07:50,257 ----------------------------------------------------------------------------------------------------
2023-10-13 11:07:50,257 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 11:07:50,257 - metric: "('micro avg', 'f1-score')"
2023-10-13 11:07:50,257 ----------------------------------------------------------------------------------------------------
2023-10-13 11:07:50,257 Computation:
2023-10-13 11:07:50,257 - compute on device: cuda:0
2023-10-13 11:07:50,257 - embedding storage: none
2023-10-13 11:07:50,257 ----------------------------------------------------------------------------------------------------
2023-10-13 11:07:50,257 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-13 11:07:50,257 ----------------------------------------------------------------------------------------------------
2023-10-13 11:07:50,257 ----------------------------------------------------------------------------------------------------
2023-10-13 11:07:51,349 epoch 1 - iter 24/242 - loss 3.18371892 - time (sec): 1.09 - samples/sec: 2014.36 - lr: 0.000005
- momentum: 0.000000
2023-10-13 11:07:52,451 epoch 1 - iter 48/242 - loss 2.57596024 - time (sec): 2.19 - samples/sec: 2197.83 - lr: 0.000010 - momentum: 0.000000
2023-10-13 11:07:53,520 epoch 1 - iter 72/242 - loss 1.95454570 - time (sec): 3.26 - samples/sec: 2210.52 - lr: 0.000015 - momentum: 0.000000
2023-10-13 11:07:54,630 epoch 1 - iter 96/242 - loss 1.58160402 - time (sec): 4.37 - samples/sec: 2274.47 - lr: 0.000020 - momentum: 0.000000
2023-10-13 11:07:55,735 epoch 1 - iter 120/242 - loss 1.37388781 - time (sec): 5.48 - samples/sec: 2274.02 - lr: 0.000025 - momentum: 0.000000
2023-10-13 11:07:56,807 epoch 1 - iter 144/242 - loss 1.22479270 - time (sec): 6.55 - samples/sec: 2225.17 - lr: 0.000030 - momentum: 0.000000
2023-10-13 11:07:57,917 epoch 1 - iter 168/242 - loss 1.10074637 - time (sec): 7.66 - samples/sec: 2240.57 - lr: 0.000035 - momentum: 0.000000
2023-10-13 11:07:58,997 epoch 1 - iter 192/242 - loss 1.00418203 - time (sec): 8.74 - samples/sec: 2238.04 - lr: 0.000039 - momentum: 0.000000
2023-10-13 11:08:00,089 epoch 1 - iter 216/242 - loss 0.92938882 - time (sec): 9.83 - samples/sec: 2223.02 - lr: 0.000044 - momentum: 0.000000
2023-10-13 11:08:01,205 epoch 1 - iter 240/242 - loss 0.85286162 - time (sec): 10.95 - samples/sec: 2236.10 - lr: 0.000049 - momentum: 0.000000
2023-10-13 11:08:01,298 ----------------------------------------------------------------------------------------------------
2023-10-13 11:08:01,299 EPOCH 1 done: loss 0.8448 - lr: 0.000049
2023-10-13 11:08:02,174 DEV : loss 0.20342281460762024 - f1-score (micro avg) 0.6
2023-10-13 11:08:02,179 saving best model
2023-10-13 11:08:02,540 ----------------------------------------------------------------------------------------------------
2023-10-13 11:08:03,647 epoch 2 - iter 24/242 - loss 0.22758628 - time (sec): 1.11 - samples/sec: 2334.16 - lr: 0.000049 - momentum: 0.000000
2023-10-13 11:08:04,716 epoch 2 - iter 48/242 - loss 0.21528058 - time (sec): 2.17 - samples/sec: 2390.69 - lr: 0.000049 - momentum: 0.000000
2023-10-13 11:08:05,784 epoch 2 - iter 72/242 - loss 0.22048385 - time (sec): 3.24 - samples/sec: 2270.24 - lr: 0.000048 - momentum: 0.000000
2023-10-13 11:08:06,890 epoch 2 - iter 96/242 - loss 0.20683587 - time (sec): 4.35 - samples/sec: 2282.48 - lr: 0.000048 - momentum: 0.000000
2023-10-13 11:08:07,953 epoch 2 - iter 120/242 - loss 0.19005376 - time (sec): 5.41 - samples/sec: 2263.80 - lr: 0.000047 - momentum: 0.000000
2023-10-13 11:08:09,036 epoch 2 - iter 144/242 - loss 0.18792079 - time (sec): 6.49 - samples/sec: 2270.81 - lr: 0.000047 - momentum: 0.000000
2023-10-13 11:08:10,154 epoch 2 - iter 168/242 - loss 0.18222929 - time (sec): 7.61 - samples/sec: 2270.44 - lr: 0.000046 - momentum: 0.000000
2023-10-13 11:08:11,258 epoch 2 - iter 192/242 - loss 0.17949989 - time (sec): 8.72 - samples/sec: 2269.25 - lr: 0.000046 - momentum: 0.000000
2023-10-13 11:08:12,348 epoch 2 - iter 216/242 - loss 0.17136972 - time (sec): 9.81 - samples/sec: 2271.24 - lr: 0.000045 - momentum: 0.000000
2023-10-13 11:08:13,402 epoch 2 - iter 240/242 - loss 0.16899960 - time (sec): 10.86 - samples/sec: 2268.54 - lr: 0.000045 - momentum: 0.000000
2023-10-13 11:08:13,487 ----------------------------------------------------------------------------------------------------
2023-10-13 11:08:13,487 EPOCH 2 done: loss 0.1698 - lr: 0.000045
2023-10-13 11:08:14,272 DEV : loss 0.14844480156898499 - f1-score (micro avg) 0.7626
2023-10-13 11:08:14,277 saving best model
2023-10-13 11:08:14,744 ----------------------------------------------------------------------------------------------------
2023-10-13 11:08:15,898 epoch 3 - iter 24/242 - loss 0.10777340 - time (sec): 1.15 - samples/sec: 2145.43 - lr: 0.000044 - momentum: 0.000000
2023-10-13 11:08:17,068 epoch 3 - iter 48/242 - loss 0.12615430 - time (sec): 2.32 - samples/sec: 2127.10 - lr: 0.000043 - momentum: 0.000000
2023-10-13 11:08:18,176 epoch 3 - iter 72/242 - loss 0.11486814 - time (sec): 3.43 - samples/sec: 2038.06 - lr: 0.000043 - momentum: 0.000000
2023-10-13 11:08:19,257 epoch 3 - iter 96/242 - loss 0.10526047 - time (sec): 4.51 - samples/sec: 2116.82 - lr: 0.000042 - momentum: 0.000000
2023-10-13 11:08:20,358 epoch 3 - iter 120/242 - loss 0.12266542 - time (sec): 5.61 - samples/sec: 2158.18 - lr: 0.000042 - momentum: 0.000000
2023-10-13 11:08:21,436 epoch 3 - iter 144/242 - loss 0.12778498 - time (sec): 6.69 - samples/sec: 2200.90 - lr: 0.000041 - momentum: 0.000000
2023-10-13 11:08:22,539 epoch 3 - iter 168/242 - loss 0.11820235 - time (sec): 7.79 - samples/sec: 2227.12 - lr: 0.000041 - momentum: 0.000000
2023-10-13 11:08:23,616 epoch 3 - iter 192/242 - loss 0.12499658 - time (sec): 8.87 - samples/sec: 2225.14 - lr: 0.000040 - momentum: 0.000000
2023-10-13 11:08:24,672 epoch 3 - iter 216/242 - loss 0.11906164 - time (sec): 9.92 - samples/sec: 2208.90 - lr: 0.000040 - momentum: 0.000000
2023-10-13 11:08:25,750 epoch 3 - iter 240/242 - loss 0.11394204 - time (sec): 11.00 - samples/sec: 2227.03 - lr: 0.000039 - momentum: 0.000000
2023-10-13 11:08:25,836 ----------------------------------------------------------------------------------------------------
2023-10-13 11:08:25,837 EPOCH 3 done: loss 0.1131 - lr: 0.000039
2023-10-13 11:08:26,656 DEV : loss 0.1403370201587677 - f1-score (micro avg) 0.8243
2023-10-13 11:08:26,662 saving best model
2023-10-13 11:08:27,161 ----------------------------------------------------------------------------------------------------
2023-10-13 11:08:28,331 epoch 4 - iter 24/242 - loss 0.06596586 - time (sec): 1.17 - samples/sec: 2068.96 - lr: 0.000038 - momentum: 0.000000
2023-10-13 11:08:29,552 epoch 4 - iter 48/242 - loss 0.06831484 - time (sec): 2.39 - samples/sec: 2106.37 - lr: 0.000038 - momentum: 0.000000
2023-10-13 11:08:30,743 epoch 4 - iter 72/242 - loss 0.06654065 - time (sec): 3.58 - samples/sec: 2055.36 - lr: 0.000037 - momentum: 0.000000
2023-10-13 11:08:31,906 epoch 4 - iter 96/242 - loss 0.06398005 - time (sec): 4.74 - samples/sec: 2067.72 - lr: 0.000037 - momentum: 0.000000
2023-10-13 11:08:33,021 epoch 4 - iter 120/242 - loss 0.06641754 - time (sec): 5.86 - samples/sec: 2114.27 - lr: 0.000036 - momentum: 0.000000
2023-10-13 11:08:34,089 epoch 4 - iter 144/242 - loss 0.06951926 - time (sec): 6.93 - samples/sec: 2158.36 - lr: 0.000036 - momentum: 0.000000
2023-10-13 11:08:35,176 epoch 4 - iter 168/242 - loss 0.06799826 - time (sec): 8.01 - samples/sec: 2156.32 - lr: 0.000035 - momentum: 0.000000
2023-10-13 11:08:36,247 epoch 4 - iter 192/242 - loss 0.07037930 - time (sec): 9.09 - samples/sec: 2148.14 - lr: 0.000035 - momentum: 0.000000
2023-10-13 11:08:37,328 epoch 4 - iter 216/242 - loss 0.07292490 - time (sec): 10.17 - samples/sec: 2155.33 - lr: 0.000034 - momentum: 0.000000
2023-10-13 11:08:38,464 epoch 4 - iter 240/242 - loss 0.07036982 - time (sec): 11.30 - samples/sec: 2177.46 - lr: 0.000033 - momentum: 0.000000
2023-10-13 11:08:38,556 ----------------------------------------------------------------------------------------------------
2023-10-13 11:08:38,556 EPOCH 4 done: loss 0.0700 - lr: 0.000033
2023-10-13 11:08:39,398 DEV : loss 0.1509368121623993 - f1-score (micro avg) 0.8365
2023-10-13 11:08:39,404 saving best model
2023-10-13 11:08:39,914 ----------------------------------------------------------------------------------------------------
2023-10-13 11:08:41,015 epoch 5 - iter 24/242 - loss 0.07172988 - time (sec): 1.10 - samples/sec: 2410.95 - lr: 0.000033 - momentum: 0.000000
2023-10-13 11:08:42,079 epoch 5 - iter 48/242 - loss 0.07010272 - time (sec): 2.16 - samples/sec: 2314.26 - lr: 0.000032 - momentum: 0.000000
2023-10-13 11:08:43,161 epoch 5 - iter 72/242 - loss 0.05357899 - time (sec): 3.24 - samples/sec: 2241.95 - lr: 0.000032 - momentum: 0.000000
2023-10-13 11:08:44,258 epoch 5 - iter 96/242 - loss 0.06053123 - time (sec): 4.34 - samples/sec: 2252.25 - lr: 0.000031 - momentum: 0.000000
2023-10-13 11:08:45,328 epoch 5 - iter 120/242 - loss 0.06273107 - time (sec): 5.41 - samples/sec: 2268.25 - lr: 0.000031 - momentum: 0.000000
2023-10-13 11:08:46,377 epoch 5 - iter 144/242 - loss 0.05890511 - time (sec): 6.46 - samples/sec: 2288.21 - lr: 0.000030 - momentum: 0.000000
2023-10-13 11:08:47,452 epoch 5 - iter 168/242 - loss 0.06151768 - time (sec): 7.53 - samples/sec: 2314.99 - lr: 0.000030 - momentum: 0.000000
2023-10-13 11:08:48,608 epoch 5 - iter 192/242 - loss 0.05962828 - time (sec): 8.69 - samples/sec: 2285.31 - lr: 0.000029 - momentum: 0.000000
2023-10-13 11:08:49,735 epoch 5 - iter 216/242 - loss 0.05957175 - time (sec): 9.82 - samples/sec: 2286.62 - lr: 0.000028 - momentum: 0.000000
2023-10-13 11:08:50,786 epoch 5 - iter 240/242 - loss 0.05693543 - time (sec): 10.87 - samples/sec: 2264.38 - lr: 0.000028 - momentum: 0.000000
2023-10-13 11:08:50,868 ----------------------------------------------------------------------------------------------------
2023-10-13 11:08:50,868 EPOCH 5 done: loss 0.0572 - lr: 0.000028
2023-10-13 11:08:51,759 DEV : loss 0.18690580129623413 - f1-score (micro avg) 0.8143
2023-10-13 11:08:51,766 ----------------------------------------------------------------------------------------------------
2023-10-13 11:08:53,082 epoch 6 - iter 24/242 - loss 0.05722285 - time (sec): 1.31 - samples/sec: 1970.01 - lr: 0.000027 - momentum: 0.000000
2023-10-13 11:08:54,456 epoch 6 - iter 48/242 - loss 0.05244200 - time (sec): 2.69 - samples/sec: 1893.11 - lr: 0.000027 - momentum: 0.000000
2023-10-13 11:08:55,814 epoch 6 - iter 72/242 - loss 0.04904005 - time (sec): 4.05 - samples/sec: 1902.82 - lr: 0.000026 - momentum: 0.000000
2023-10-13 11:08:57,057 epoch 6 - iter 96/242 - loss 0.04467185 - time (sec): 5.29 - samples/sec: 1854.33 - lr: 0.000026 - momentum: 0.000000
2023-10-13 11:08:58,206 epoch 6 - iter 120/242 - loss 0.04420946 - time (sec): 6.44 - samples/sec: 1940.66 - lr: 0.000025 - momentum: 0.000000
2023-10-13 11:08:59,280 epoch 6 - iter 144/242 - loss 0.03970356 - time (sec): 7.51 - samples/sec: 1985.99 - lr: 0.000025 - momentum: 0.000000
2023-10-13 11:09:00,328 epoch 6 - iter 168/242 - loss 0.04408623 - time (sec): 8.56 - samples/sec: 1993.20 - lr: 0.000024 - momentum: 0.000000
2023-10-13 11:09:01,398 epoch 6 - iter 192/242 - loss 0.04034396 - time (sec): 9.63 - samples/sec: 2014.68 - lr: 0.000023 - momentum: 0.000000
2023-10-13 11:09:02,476 epoch 6 - iter 216/242 - loss 0.03986324 - time (sec): 10.71 - samples/sec: 2039.88 - lr: 0.000023 - momentum: 0.000000
2023-10-13 11:09:03,576 epoch 6 - iter 240/242 - loss 0.03959649 - time (sec): 11.81 - samples/sec: 2083.43 - lr: 0.000022 - momentum: 0.000000
2023-10-13 11:09:03,664 ----------------------------------------------------------------------------------------------------
2023-10-13 11:09:03,664 EPOCH 6 done: loss 0.0399 - lr: 0.000022
2023-10-13 11:09:04,432 DEV : loss 0.20703579485416412 - f1-score (micro avg) 0.8067
2023-10-13 11:09:04,437 ----------------------------------------------------------------------------------------------------
2023-10-13 11:09:05,523 epoch 7 - iter 24/242 - loss 0.01304402 - time (sec): 1.08 - samples/sec: 2354.20 - lr: 0.000022 - momentum: 0.000000
2023-10-13 11:09:06,615 epoch 7 - iter 48/242 - loss 0.01891244 - time (sec): 2.18 - samples/sec: 2320.61 - lr: 0.000021 - momentum: 0.000000
2023-10-13 11:09:07,679 epoch 7 - iter 72/242 - loss 0.02175415 - time (sec): 3.24 - samples/sec: 2271.91 - lr: 0.000021 - momentum: 0.000000
2023-10-13 11:09:08,749 epoch 7 - iter 96/242 - loss 0.02731527 - time (sec): 4.31 - samples/sec: 2262.01 - lr: 0.000020 - momentum: 0.000000
2023-10-13 11:09:09,812 epoch 7 - iter 120/242 - loss 0.03234705 - time (sec): 5.37 - samples/sec: 2276.02 - lr: 0.000020 - momentum: 0.000000
2023-10-13 11:09:10,923 epoch 7 - iter 144/242 - loss 0.02891570 - time (sec): 6.48 - samples/sec: 2285.96 - lr: 0.000019 - momentum: 0.000000
2023-10-13 11:09:11,986 epoch 7 - iter 168/242 - loss 0.02815719 - time (sec): 7.55 - samples/sec: 2295.21 - lr: 0.000018 - momentum: 0.000000
2023-10-13 11:09:13,061 epoch 7 - iter 192/242 - loss 0.02865721 - time (sec): 8.62 - samples/sec: 2270.16 - lr: 0.000018 - momentum: 0.000000
2023-10-13 11:09:14,140 epoch 7 - iter 216/242 - loss 0.02953733 - time (sec): 9.70 - samples/sec: 2277.81 - lr: 0.000017 - momentum: 0.000000
2023-10-13 11:09:15,223 epoch 7 - iter 240/242 - loss 0.02922147 - time (sec): 10.78 - samples/sec: 2274.96 - lr: 0.000017 - momentum: 0.000000
2023-10-13 11:09:15,311 ----------------------------------------------------------------------------------------------------
2023-10-13 11:09:15,311 EPOCH 7 done: loss 0.0290 - lr: 0.000017
2023-10-13 11:09:16,067 DEV : loss 0.203294038772583 - f1-score (micro avg) 0.8335
2023-10-13 11:09:16,072 ----------------------------------------------------------------------------------------------------
2023-10-13 11:09:17,175 epoch 8 - iter 24/242 - loss 0.02027913 - time (sec): 1.10 - samples/sec: 2439.39 - lr: 0.000016 - momentum: 0.000000
2023-10-13 11:09:18,248 epoch 8 - iter 48/242 - loss 0.02740789 - time (sec): 2.18 - samples/sec: 2388.95 - lr: 0.000016 - momentum: 0.000000
2023-10-13 11:09:19,334 epoch 8 - iter 72/242 - loss 0.02091675 - time (sec): 3.26 - samples/sec: 2361.24 - lr: 0.000015 - momentum: 0.000000
2023-10-13 11:09:20,420 epoch 8 - iter 96/242 - loss 0.02309290 - time (sec): 4.35 - samples/sec: 2349.70 - lr: 0.000015 - momentum: 0.000000
2023-10-13 11:09:21,490 epoch 8 - iter 120/242 - loss 0.02237206 - time (sec): 5.42 - samples/sec: 2367.97 - lr: 0.000014 - momentum: 0.000000
2023-10-13 11:09:22,599 epoch 8 - iter 144/242 - loss 0.02123187 - time (sec): 6.53 - samples/sec: 2329.24 - lr: 0.000013 - momentum: 0.000000
2023-10-13 11:09:23,709 epoch 8 - iter 168/242 - loss 0.02196109 - time (sec): 7.64 - samples/sec: 2288.74 - lr: 0.000013 - momentum: 0.000000
2023-10-13 11:09:24,804 epoch 8 - iter 192/242 - loss 0.02266697 - time (sec): 8.73 - samples/sec: 2269.13 - lr: 0.000012 - momentum: 0.000000
2023-10-13 11:09:25,914 epoch 8 - iter 216/242 - loss 0.02158182 - time (sec): 9.84 - samples/sec: 2271.83 - lr: 0.000012 - momentum: 0.000000
2023-10-13 11:09:26,980 epoch 8 - iter 240/242 - loss 0.02105113 - time (sec): 10.91 - samples/sec: 2260.99 - lr: 0.000011 - momentum: 0.000000
2023-10-13 11:09:27,061 ----------------------------------------------------------------------------------------------------
2023-10-13 11:09:27,061 EPOCH 8 done: loss 0.0210 - lr: 0.000011
2023-10-13 11:09:28,017 DEV : loss 0.19338418543338776 - f1-score (micro avg) 0.8507
2023-10-13 11:09:28,023 saving best model
2023-10-13 11:09:28,540 ----------------------------------------------------------------------------------------------------
2023-10-13 11:09:29,691 epoch 9 - iter 24/242 - loss 0.01795266 - time (sec): 1.15 - samples/sec: 2122.41 - lr: 0.000011 - momentum: 0.000000
2023-10-13 11:09:30,875 epoch 9 - iter 48/242 - loss 0.01915869 - time (sec): 2.33 - samples/sec: 2182.73 - lr: 0.000010 - momentum: 0.000000
2023-10-13 11:09:32,038 epoch 9 - iter 72/242 - loss 0.02158611 - time (sec): 3.49 - samples/sec: 2223.27 - lr: 0.000010 - momentum: 0.000000
2023-10-13 11:09:33,189 epoch 9 - iter 96/242 - loss 0.01719087 - time (sec): 4.65 - samples/sec: 2120.24 - lr: 0.000009 - momentum: 0.000000
2023-10-13 11:09:34,331 epoch 9 - iter 120/242 - loss 0.01594553 - time (sec): 5.79 - samples/sec: 2076.69 - lr: 0.000008 - momentum: 0.000000
2023-10-13 11:09:35,409 epoch 9 - iter 144/242 - loss 0.01441884 - time (sec): 6.87 - samples/sec: 2112.07 - lr: 0.000008 - momentum: 0.000000
2023-10-13 11:09:36,489 epoch 9 - iter 168/242 - loss 0.01395817 - time (sec): 7.95 - samples/sec: 2166.27 - lr: 0.000007 - momentum: 0.000000
2023-10-13 11:09:37,582 epoch 9 - iter 192/242 - loss 0.01254251 - time (sec): 9.04 - samples/sec: 2170.68 - lr: 0.000007 - momentum: 0.000000
2023-10-13 11:09:38,636 epoch 9 - iter 216/242 - loss 0.01437384 - time (sec): 10.09 - samples/sec: 2159.19 - lr: 0.000006 - momentum: 0.000000
2023-10-13 11:09:39,741 epoch 9 - iter 240/242 - loss 0.01382778 - time (sec): 11.20 - samples/sec: 2197.18 - lr: 0.000006 - momentum: 0.000000
2023-10-13 11:09:39,827 ----------------------------------------------------------------------------------------------------
2023-10-13 11:09:39,827 EPOCH 9 done: loss 0.0140 - lr: 0.000006
2023-10-13 11:09:40,667 DEV : loss 0.1840001940727234 - f1-score (micro avg) 0.8553
2023-10-13 11:09:40,673 saving best model
2023-10-13 11:09:41,174 ----------------------------------------------------------------------------------------------------
2023-10-13 11:09:42,272 epoch 10 - iter 24/242 - loss 0.00815395 - time (sec): 1.10 - samples/sec: 2217.13 - lr: 0.000005 - momentum: 0.000000
2023-10-13 11:09:43,373 epoch 10 - iter 48/242 - loss 0.01267221 - time (sec): 2.20 - samples/sec: 2360.19 - lr: 0.000005 - momentum: 0.000000
2023-10-13 11:09:44,445 epoch 10 - iter 72/242 - loss 0.00988114 - time (sec): 3.27 - samples/sec: 2277.47 - lr: 0.000004 - momentum: 0.000000
2023-10-13 11:09:45,514 epoch 10 - iter 96/242 - loss 0.01014757 - time (sec): 4.34 - samples/sec: 2212.92 - lr: 0.000003 - momentum: 0.000000
2023-10-13 11:09:46,614 epoch 10 - iter 120/242 - loss 0.00919350 - time (sec): 5.44 - samples/sec: 2298.46 - lr: 0.000003 - momentum: 0.000000
2023-10-13 11:09:47,695 epoch 10 - iter 144/242 - loss 0.00878841 - time (sec): 6.52 - samples/sec: 2280.14 - lr: 0.000002 - momentum: 0.000000
2023-10-13 11:09:48,784 epoch 10 - iter 168/242 - loss 0.01029727 - time (sec): 7.61 - samples/sec: 2258.55 - lr: 0.000002 - momentum: 0.000000
2023-10-13 11:09:49,851 epoch 10 - iter 192/242 - loss 0.01213754 - time (sec): 8.67 - samples/sec: 2239.26 - lr: 0.000001 - momentum: 0.000000
2023-10-13 11:09:50,933 epoch 10 - iter 216/242 - loss 0.01112444 - time (sec): 9.76 - samples/sec: 2245.84 - lr: 0.000001 - momentum: 0.000000
2023-10-13 11:09:52,058 epoch 10 - iter 240/242 - loss 0.01028077 - time (sec): 10.88 - samples/sec: 2257.48 -
lr: 0.000000 - momentum: 0.000000
2023-10-13 11:09:52,141 ----------------------------------------------------------------------------------------------------
2023-10-13 11:09:52,142 EPOCH 10 done: loss 0.0102 - lr: 0.000000
2023-10-13 11:09:52,913 DEV : loss 0.19032004475593567 - f1-score (micro avg) 0.8468
2023-10-13 11:09:53,275 ----------------------------------------------------------------------------------------------------
2023-10-13 11:09:53,276 Loading model from best epoch ...
2023-10-13 11:09:54,656 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 11:09:55,491 Results:
- F-score (micro) 0.8235
- F-score (macro) 0.5974
- Accuracy 0.7146

By class:
              precision    recall  f1-score   support

        pers     0.8581    0.9137    0.8850       139
       scope     0.8156    0.8915    0.8519       129
        work     0.6667    0.7750    0.7168        80
         loc     0.6667    0.4444    0.5333         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7938    0.8556    0.8235       360
   macro avg     0.6014    0.6049    0.5974       360
weighted avg     0.7884    0.8556    0.8196       360

2023-10-13 11:09:55,491 ----------------------------------------------------------------------------------------------------
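
The summary scores in the report above can be cross-checked against the per-class rows. The following is a minimal sketch, not part of the original run: the numbers are copied by hand from the report, and the variable and function names are illustrative assumptions. It assumes the usual definitions: micro F1 as the harmonic mean of the micro-averaged precision and recall, macro F1 as the unweighted mean of the per-class F1 values, and weighted F1 as the support-weighted mean.

```python
# Per-class (precision, recall, f1, support), copied from the report above.
per_class = {
    "pers":  (0.8581, 0.9137, 0.8850, 139),
    "scope": (0.8156, 0.8915, 0.8519, 129),
    "work":  (0.6667, 0.7750, 0.7168, 80),
    "loc":   (0.6667, 0.4444, 0.5333, 9),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Micro F1 from the reported micro-averaged precision and recall.
micro_f1 = f1(0.7938, 0.8556)

# Macro F1: unweighted mean of the per-class F1 values.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by support.
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

Because the inputs are already rounded to four decimals, small differences in the last digit relative to the logged values would not indicate an error.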