2023-10-13 10:59:08,250 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:08,251 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 10:59:08,251 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:08,251 MultiCorpus: 966 train + 219 dev + 204 test sentences
 - NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-13 10:59:08,251 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:08,251 Train: 966 sentences
2023-10-13 10:59:08,251 (train_with_dev=False, train_with_test=False)
2023-10-13 10:59:08,251 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:08,251 Training Params:
2023-10-13 10:59:08,251 - learning_rate: "5e-05"
2023-10-13 10:59:08,251 - mini_batch_size: "4"
2023-10-13 10:59:08,251 - max_epochs: "10"
2023-10-13 10:59:08,251 - shuffle: "True"
2023-10-13 10:59:08,251 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:08,251 Plugins:
2023-10-13 10:59:08,252 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 10:59:08,252 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:08,252 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 10:59:08,252 - metric: "('micro avg', 'f1-score')"
2023-10-13 10:59:08,252 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:08,252 Computation:
2023-10-13 10:59:08,252 - compute on device: cuda:0
2023-10-13 10:59:08,252 - embedding storage: none
2023-10-13 10:59:08,252 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:08,252 Model training base path: "hmbench-ajmc/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 10:59:08,252 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:08,252 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:09,306 epoch 1 - iter 24/242 - loss 3.28733292 - time (sec): 1.05 - samples/sec: 2366.78 - lr: 0.000005 - momentum: 0.000000
2023-10-13 10:59:10,392 epoch 1 - iter 48/242 - loss 2.63402043 - time (sec): 2.14 - samples/sec: 2326.05 - lr: 0.000010 - momentum: 0.000000
2023-10-13 10:59:11,459 epoch 1 - iter 72/242 - loss 2.00586654 - time (sec): 3.21 - samples/sec: 2324.24 - lr: 0.000015 - momentum: 0.000000
2023-10-13 10:59:12,508 epoch 1 - iter 96/242 - loss 1.68954188 - time (sec): 4.26 - samples/sec: 2294.28 - lr: 0.000020 - momentum: 0.000000
2023-10-13 10:59:13,588 epoch 1 - iter 120/242 - loss 1.45281702 - time (sec): 5.33 - samples/sec: 2263.62 - lr: 0.000025 - momentum: 0.000000
2023-10-13 10:59:14,716 epoch 1 - iter 144/242 - loss 1.26827468 - time (sec): 6.46 - samples/sec: 2252.45 - lr: 0.000030 - momentum: 0.000000
2023-10-13 10:59:15,809 epoch 1 - iter 168/242 - loss 1.11884816 - time (sec): 7.56 - samples/sec: 2283.96 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:59:16,875 epoch 1 - iter 192/242 - loss 1.01409299 - time (sec): 8.62 - samples/sec: 2294.74 - lr: 0.000039 - momentum: 0.000000
2023-10-13 10:59:17,918 epoch 1 - iter 216/242 - loss 0.94260177 - time (sec): 9.67 - samples/sec: 2287.57 - lr: 0.000044 - momentum: 0.000000
2023-10-13 10:59:19,051 epoch 1 - iter 240/242 - loss 0.87025608 - time (sec): 10.80 - samples/sec: 2278.82 - lr: 0.000049 - momentum: 0.000000
2023-10-13 10:59:19,134 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:19,135 EPOCH 1 done: loss 0.8660 - lr: 0.000049
2023-10-13 10:59:19,874 DEV : loss 0.21214990317821503 - f1-score (micro avg) 0.5594
2023-10-13 10:59:19,879 saving best model
2023-10-13 10:59:20,221 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:21,273 epoch 2 - iter 24/242 - loss 0.22624820 - time (sec): 1.05 - samples/sec: 2433.95 - lr: 0.000049 - momentum: 0.000000
2023-10-13 10:59:22,350 epoch 2 - iter 48/242 - loss 0.21233454 - time (sec): 2.13 - samples/sec: 2375.85 - lr: 0.000049 - momentum: 0.000000
2023-10-13 10:59:23,396 epoch 2 - iter 72/242 - loss 0.18839820 - time (sec): 3.17 - samples/sec: 2339.79 - lr: 0.000048 - momentum: 0.000000
2023-10-13 10:59:24,448 epoch 2 - iter 96/242 - loss 0.17271089 - time (sec): 4.22 - samples/sec: 2330.52 - lr: 0.000048 - momentum: 0.000000
2023-10-13 10:59:25,552 epoch 2 - iter 120/242 - loss 0.18913709 - time (sec): 5.33 - samples/sec: 2356.25 - lr: 0.000047 - momentum: 0.000000
2023-10-13 10:59:26,636 epoch 2 - iter 144/242 - loss 0.17833993 - time (sec): 6.41 - samples/sec: 2388.36 - lr: 0.000047 - momentum: 0.000000
2023-10-13 10:59:27,689 epoch 2 - iter 168/242 - loss 0.17433724 - time (sec): 7.47 - samples/sec: 2363.68 - lr: 0.000046 - momentum: 0.000000
2023-10-13 10:59:28,762 epoch 2 - iter 192/242 - loss 0.17874942 - time (sec): 8.54 - samples/sec: 2321.69 - lr: 0.000046 - momentum: 0.000000
2023-10-13 10:59:29,824 epoch 2 - iter 216/242 - loss 0.18049272 - time (sec): 9.60 - samples/sec: 2297.49 - lr: 0.000045 - momentum: 0.000000
2023-10-13 10:59:30,905 epoch 2 - iter 240/242 - loss 0.17478721 - time (sec): 10.68 - samples/sec: 2301.40 - lr: 0.000045 - momentum: 0.000000
2023-10-13 10:59:30,990 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:30,991 EPOCH 2 done: loss 0.1742 - lr: 0.000045
2023-10-13 10:59:31,852 DEV : loss 0.12608306109905243 - f1-score (micro avg) 0.8045
2023-10-13 10:59:31,861 saving best model
2023-10-13 10:59:32,380 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:33,707 epoch 3 - iter 24/242 - loss 0.09298301 - time (sec): 1.32 - samples/sec: 1779.37 - lr: 0.000044 - momentum: 0.000000
2023-10-13 10:59:34,962 epoch 3 - iter 48/242 - loss 0.10636551 - time (sec): 2.58 - samples/sec: 1926.00 - lr: 0.000043 - momentum: 0.000000
2023-10-13 10:59:36,208 epoch 3 - iter 72/242 - loss 0.10343864 - time (sec): 3.83 - samples/sec: 1932.86 - lr: 0.000043 - momentum: 0.000000
2023-10-13 10:59:37,481 epoch 3 - iter 96/242 - loss 0.10112269 - time (sec): 5.10 - samples/sec: 1923.61 - lr: 0.000042 - momentum: 0.000000
2023-10-13 10:59:38,690 epoch 3 - iter 120/242 - loss 0.10150347 - time (sec): 6.31 - samples/sec: 1929.61 - lr: 0.000042 - momentum: 0.000000
2023-10-13 10:59:40,038 epoch 3 - iter 144/242 - loss 0.10549571 - time (sec): 7.66 - samples/sec: 1952.93 - lr: 0.000041 - momentum: 0.000000
2023-10-13 10:59:41,356 epoch 3 - iter 168/242 - loss 0.10215889 - time (sec): 8.97 - samples/sec: 1917.90 - lr: 0.000041 - momentum: 0.000000
2023-10-13 10:59:42,677 epoch 3 - iter 192/242 - loss 0.10211715 - time (sec): 10.29 - samples/sec: 1910.10 - lr: 0.000040 - momentum: 0.000000
2023-10-13 10:59:44,060 epoch 3 - iter 216/242 - loss 0.09881486 - time (sec): 11.68 - samples/sec: 1888.38 - lr: 0.000040 - momentum: 0.000000
2023-10-13 10:59:45,285 epoch 3 - iter 240/242 - loss 0.09977052 - time (sec): 12.90 - samples/sec: 1910.68 - lr: 0.000039 - momentum: 0.000000
2023-10-13 10:59:45,381 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:45,381 EPOCH 3 done: loss 0.0996 - lr: 0.000039
2023-10-13 10:59:46,225 DEV : loss 0.1327812224626541 - f1-score (micro avg) 0.8134
2023-10-13 10:59:46,230 saving best model
2023-10-13 10:59:46,704 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:47,820 epoch 4 - iter 24/242 - loss 0.04194900 - time (sec): 1.11 - samples/sec: 2245.65 - lr: 0.000038 - momentum: 0.000000
2023-10-13 10:59:48,933 epoch 4 - iter 48/242 - loss 0.06746592 - time (sec): 2.23 - samples/sec: 2269.70 - lr: 0.000038 - momentum: 0.000000
2023-10-13 10:59:50,079 epoch 4 - iter 72/242 - loss 0.07175249 - time (sec): 3.37 - samples/sec: 2327.56 - lr: 0.000037 - momentum: 0.000000
2023-10-13 10:59:51,181 epoch 4 - iter 96/242 - loss 0.07971414 - time (sec): 4.48 - samples/sec: 2280.25 - lr: 0.000037 - momentum: 0.000000
2023-10-13 10:59:52,322 epoch 4 - iter 120/242 - loss 0.07312796 - time (sec): 5.62 - samples/sec: 2279.12 - lr: 0.000036 - momentum: 0.000000
2023-10-13 10:59:53,455 epoch 4 - iter 144/242 - loss 0.07512113 - time (sec): 6.75 - samples/sec: 2270.23 - lr: 0.000036 - momentum: 0.000000
2023-10-13 10:59:54,569 epoch 4 - iter 168/242 - loss 0.07531151 - time (sec): 7.86 - samples/sec: 2286.32 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:59:55,639 epoch 4 - iter 192/242 - loss 0.07672808 - time (sec): 8.93 - samples/sec: 2230.72 - lr: 0.000035 - momentum: 0.000000
2023-10-13 10:59:56,705 epoch 4 - iter 216/242 - loss 0.07417428 - time (sec): 10.00 - samples/sec: 2234.15 - lr: 0.000034 - momentum: 0.000000
2023-10-13 10:59:57,756 epoch 4 - iter 240/242 - loss 0.07268832 - time (sec): 11.05 - samples/sec: 2231.23 - lr: 0.000033 - momentum: 0.000000
2023-10-13 10:59:57,843 ----------------------------------------------------------------------------------------------------
2023-10-13 10:59:57,843 EPOCH 4 done: loss 0.0724 - lr: 0.000033
2023-10-13 10:59:58,635 DEV : loss 0.1568627655506134 - f1-score (micro avg) 0.8312
2023-10-13 10:59:58,640 saving best model
2023-10-13 10:59:59,110 ----------------------------------------------------------------------------------------------------
2023-10-13 11:00:00,176 epoch 5 - iter 24/242 - loss 0.03483374 - time (sec): 1.06 - samples/sec: 1997.81 - lr: 0.000033 - momentum: 0.000000
2023-10-13 11:00:01,249 epoch 5 - iter 48/242 - loss 0.04413526 - time (sec): 2.13 - samples/sec: 2249.23 - lr: 0.000032 - momentum: 0.000000
2023-10-13 11:00:02,337 epoch 5 - iter 72/242 - loss 0.04976873 - time (sec): 3.22 - samples/sec: 2226.61 - lr: 0.000032 - momentum: 0.000000
2023-10-13 11:00:03,426 epoch 5 - iter 96/242 - loss 0.04982655 - time (sec): 4.31 - samples/sec: 2197.10 - lr: 0.000031 - momentum: 0.000000
2023-10-13 11:00:04,523 epoch 5 - iter 120/242 - loss 0.05222925 - time (sec): 5.40 - samples/sec: 2230.99 - lr: 0.000031 - momentum: 0.000000
2023-10-13 11:00:05,620 epoch 5 - iter 144/242 - loss 0.05547688 - time (sec): 6.50 - samples/sec: 2240.20 - lr: 0.000030 - momentum: 0.000000
2023-10-13 11:00:06,710 epoch 5 - iter 168/242 - loss 0.05574000 - time (sec): 7.59 - samples/sec: 2275.39 - lr: 0.000030 - momentum: 0.000000
2023-10-13 11:00:07,768 epoch 5 - iter 192/242 - loss 0.05515866 - time (sec): 8.65 - samples/sec: 2267.04 - lr: 0.000029 - momentum: 0.000000
2023-10-13 11:00:08,874 epoch 5 - iter 216/242 - loss 0.05441600 - time (sec): 9.75 - samples/sec: 2271.25 - lr: 0.000028 - momentum: 0.000000
2023-10-13 11:00:09,967 epoch 5 - iter 240/242 - loss 0.05441985 - time (sec): 10.85 - samples/sec: 2271.98 - lr: 0.000028 - momentum: 0.000000
2023-10-13 11:00:10,060 ----------------------------------------------------------------------------------------------------
2023-10-13 11:00:10,060 EPOCH 5 done: loss 0.0542 - lr: 0.000028
2023-10-13 11:00:10,812 DEV : loss 0.1912143975496292 - f1-score (micro avg) 0.8287
2023-10-13 11:00:10,817 ----------------------------------------------------------------------------------------------------
2023-10-13 11:00:11,938 epoch 6 - iter 24/242 - loss 0.05283667 - time (sec): 1.12 - samples/sec: 2277.81 - lr: 0.000027 - momentum: 0.000000
2023-10-13 11:00:13,012 epoch 6 - iter 48/242 - loss 0.03613276 - time (sec): 2.19 - samples/sec: 2198.78 - lr: 0.000027 - momentum: 0.000000
2023-10-13 11:00:14,098 epoch 6 - iter 72/242 - loss 0.03729743 - time (sec): 3.28 - samples/sec: 2240.13 - lr: 0.000026 - momentum: 0.000000
2023-10-13 11:00:15,232 epoch 6 - iter 96/242 - loss 0.03520942 - time (sec): 4.41 - samples/sec: 2228.42 - lr: 0.000026 - momentum: 0.000000
2023-10-13 11:00:16,386 epoch 6 - iter 120/242 - loss 0.03242642 - time (sec): 5.57 - samples/sec: 2253.95 - lr: 0.000025 - momentum: 0.000000
2023-10-13 11:00:17,478 epoch 6 - iter 144/242 - loss 0.03908906 - time (sec): 6.66 - samples/sec: 2229.69 - lr: 0.000025 - momentum: 0.000000
2023-10-13 11:00:18,575 epoch 6 - iter 168/242 - loss 0.03867449 - time (sec): 7.76 - samples/sec: 2214.95 - lr: 0.000024 - momentum: 0.000000
2023-10-13 11:00:19,656 epoch 6 - iter 192/242 - loss 0.03996907 - time (sec): 8.84 - samples/sec: 2236.10 - lr: 0.000023 - momentum: 0.000000
2023-10-13 11:00:20,703 epoch 6 - iter 216/242 - loss 0.04038098 - time (sec): 9.88 - samples/sec: 2239.86 - lr: 0.000023 - momentum: 0.000000
2023-10-13 11:00:21,779 epoch 6 - iter 240/242 - loss 0.03904930 - time (sec): 10.96 - samples/sec: 2231.07 - lr: 0.000022 - momentum: 0.000000
2023-10-13 11:00:21,872 ----------------------------------------------------------------------------------------------------
2023-10-13 11:00:21,872 EPOCH 6 done: loss 0.0386 - lr: 0.000022
2023-10-13 11:00:22,704 DEV : loss 0.18979744613170624 - f1-score (micro avg) 0.8401
2023-10-13 11:00:22,710 saving best model
2023-10-13 11:00:23,179 ----------------------------------------------------------------------------------------------------
2023-10-13 11:00:24,347 epoch 7 - iter 24/242 - loss 0.01995235 - time (sec): 1.16 - samples/sec: 2138.90 - lr: 0.000022 - momentum: 0.000000
2023-10-13 11:00:25,425 epoch 7 - iter 48/242 - loss 0.01639084 - time (sec): 2.24 - samples/sec: 2054.26 - lr: 0.000021 - momentum: 0.000000
2023-10-13 11:00:26,499 epoch 7 - iter 72/242 - loss 0.01766459 - time (sec): 3.31 - samples/sec: 2147.70 - lr: 0.000021 - momentum: 0.000000
2023-10-13 11:00:27,634 epoch 7 - iter 96/242 - loss 0.02077422 - time (sec): 4.45 - samples/sec: 2208.62 - lr: 0.000020 - momentum: 0.000000
2023-10-13 11:00:28,711 epoch 7 - iter 120/242 - loss 0.02499053 - time (sec): 5.53 - samples/sec: 2172.53 - lr: 0.000020 - momentum: 0.000000
2023-10-13 11:00:29,804 epoch 7 - iter 144/242 - loss 0.02688253 - time (sec): 6.62 - samples/sec: 2185.03 - lr: 0.000019 - momentum: 0.000000
2023-10-13 11:00:30,915 epoch 7 - iter 168/242 - loss 0.02690714 - time (sec): 7.73 - samples/sec: 2193.29 - lr: 0.000018 - momentum: 0.000000
2023-10-13 11:00:32,036 epoch 7 - iter 192/242 - loss 0.02722919 - time (sec): 8.85 - samples/sec: 2215.68 - lr: 0.000018 - momentum: 0.000000
2023-10-13 11:00:33,112 epoch 7 - iter 216/242 - loss 0.02576495 - time (sec): 9.93 - samples/sec: 2231.01 - lr: 0.000017 - momentum: 0.000000
2023-10-13 11:00:34,219 epoch 7 - iter 240/242 - loss 0.02602469 - time (sec): 11.03 - samples/sec: 2228.97 - lr: 0.000017 - momentum: 0.000000
2023-10-13 11:00:34,306 ----------------------------------------------------------------------------------------------------
2023-10-13 11:00:34,306 EPOCH 7 done: loss 0.0259 - lr: 0.000017
2023-10-13 11:00:35,271 DEV : loss 0.1991121768951416 - f1-score (micro avg) 0.8315
2023-10-13 11:00:35,277 ----------------------------------------------------------------------------------------------------
2023-10-13 11:00:36,352 epoch 8 - iter 24/242 - loss 0.01398267 - time (sec): 1.07 - samples/sec: 2428.11 - lr: 0.000016 - momentum: 0.000000
2023-10-13 11:00:37,410 epoch 8 - iter 48/242 - loss 0.02187050 - time (sec): 2.13 - samples/sec: 2261.18 - lr: 0.000016 - momentum: 0.000000
2023-10-13 11:00:38,497 epoch 8 - iter 72/242 - loss 0.02478786 - time (sec): 3.22 - samples/sec: 2316.34 - lr: 0.000015 - momentum: 0.000000
2023-10-13 11:00:39,565 epoch 8 - iter 96/242 - loss 0.02290241 - time (sec): 4.29 - samples/sec: 2350.71 - lr: 0.000015 - momentum: 0.000000
2023-10-13 11:00:40,644 epoch 8 - iter 120/242 - loss 0.01921730 - time (sec): 5.37 - samples/sec: 2302.90 - lr: 0.000014 - momentum: 0.000000
2023-10-13 11:00:41,758 epoch 8 - iter 144/242 - loss 0.01872496 - time (sec): 6.48 - samples/sec: 2327.98 - lr: 0.000013 - momentum: 0.000000
2023-10-13 11:00:42,819 epoch 8 - iter 168/242 - loss 0.01731024 - time (sec): 7.54 - samples/sec: 2299.72 - lr: 0.000013 - momentum: 0.000000
2023-10-13 11:00:43,890 epoch 8 - iter 192/242 - loss 0.01778436 - time (sec): 8.61 - samples/sec: 2301.69 - lr: 0.000012 - momentum: 0.000000
2023-10-13 11:00:44,962 epoch 8 - iter 216/242 - loss 0.01877575 - time (sec): 9.68 - samples/sec: 2299.60 - lr: 0.000012 - momentum: 0.000000
2023-10-13 11:00:46,026 epoch 8 - iter 240/242 - loss 0.01890326 - time (sec): 10.75 - samples/sec: 2292.00 - lr: 0.000011 - momentum: 0.000000
2023-10-13 11:00:46,109 ----------------------------------------------------------------------------------------------------
2023-10-13 11:00:46,109 EPOCH 8 done: loss 0.0188 - lr: 0.000011
2023-10-13 11:00:46,955 DEV : loss 0.18604768812656403 - f1-score (micro avg) 0.8455
2023-10-13 11:00:46,960 saving best model
2023-10-13 11:00:47,419 ----------------------------------------------------------------------------------------------------
2023-10-13 11:00:48,560 epoch 9 - iter 24/242 - loss 0.02467621 - time (sec): 1.14 - samples/sec: 1974.79 - lr: 0.000011 - momentum: 0.000000
2023-10-13 11:00:49,631 epoch 9 - iter 48/242 - loss 0.01478693 - time (sec): 2.21 - samples/sec: 2156.92 - lr: 0.000010 - momentum: 0.000000
2023-10-13 11:00:50,751 epoch 9 - iter 72/242 - loss 0.01735104 - time (sec): 3.33 - samples/sec: 2179.58 - lr: 0.000010 - momentum: 0.000000
2023-10-13 11:00:51,852 epoch 9 - iter 96/242 - loss 0.01333860 - time (sec): 4.43 - samples/sec: 2240.98 - lr: 0.000009 - momentum: 0.000000
2023-10-13 11:00:52,943 epoch 9 - iter 120/242 - loss 0.01223139 - time (sec): 5.52 - samples/sec: 2226.98 - lr: 0.000008 - momentum: 0.000000
2023-10-13 11:00:54,061 epoch 9 - iter 144/242 - loss 0.01157016 - time (sec): 6.64 - samples/sec: 2213.05 - lr: 0.000008 - momentum: 0.000000
2023-10-13 11:00:55,220 epoch 9 - iter 168/242 - loss 0.01108253 - time (sec): 7.80 - samples/sec: 2215.59 - lr: 0.000007 - momentum: 0.000000
2023-10-13 11:00:56,375 epoch 9 - iter 192/242 - loss 0.01061662 - time (sec): 8.95 - samples/sec: 2224.12 - lr: 0.000007 - momentum: 0.000000
2023-10-13 11:00:57,494 epoch 9 - iter 216/242 - loss 0.01072204 - time (sec): 10.07 - samples/sec: 2199.31 - lr: 0.000006 - momentum: 0.000000
2023-10-13 11:00:58,615 epoch 9 - iter 240/242 - loss 0.01115647 - time (sec): 11.19 - samples/sec: 2194.00 - lr: 0.000006 - momentum: 0.000000
2023-10-13 11:00:58,699 ----------------------------------------------------------------------------------------------------
2023-10-13 11:00:58,700 EPOCH 9 done: loss 0.0111 - lr: 0.000006
2023-10-13 11:00:59,516 DEV : loss 0.1872359663248062 - f1-score (micro avg) 0.8451
2023-10-13 11:00:59,522 ----------------------------------------------------------------------------------------------------
2023-10-13 11:01:00,652 epoch 10 - iter 24/242 - loss 0.00276810 - time (sec): 1.13 - samples/sec: 2006.47 - lr: 0.000005 - momentum: 0.000000
2023-10-13 11:01:01,740 epoch 10 - iter 48/242 - loss 0.00426126 - time (sec): 2.22 - samples/sec: 2154.76 - lr: 0.000005 - momentum: 0.000000
2023-10-13 11:01:02,891 epoch 10 - iter 72/242 - loss 0.00479354 - time (sec): 3.37 - samples/sec: 2183.96 - lr: 0.000004 - momentum: 0.000000
2023-10-13 11:01:04,056 epoch 10 - iter 96/242 - loss 0.01271530 - time (sec): 4.53 - samples/sec: 2172.12 - lr: 0.000003 - momentum: 0.000000
2023-10-13 11:01:05,129 epoch 10 - iter 120/242 - loss 0.01214986 - time (sec): 5.61 - samples/sec: 2153.56 - lr: 0.000003 - momentum: 0.000000
2023-10-13 11:01:06,195 epoch 10 - iter 144/242 - loss 0.01140887 - time (sec): 6.67 - samples/sec: 2108.49 - lr: 0.000002 - momentum: 0.000000
2023-10-13 11:01:07,303 epoch 10 - iter 168/242 - loss 0.00972095 - time (sec): 7.78 - samples/sec: 2146.63 - lr: 0.000002 - momentum: 0.000000
2023-10-13 11:01:08,402 epoch 10 - iter 192/242 - loss 0.00946718 - time (sec): 8.88 - samples/sec: 2155.32 - lr: 0.000001 - momentum: 0.000000
2023-10-13 11:01:09,485 epoch 10 - iter 216/242 - loss 0.00880560 - time (sec): 9.96 - samples/sec: 2184.60 - lr: 0.000001 - momentum: 0.000000
2023-10-13 11:01:10,583 epoch 10 - iter 240/242 - loss 0.00862072 - time (sec): 11.06 - samples/sec: 2219.90 - lr: 0.000000 - momentum: 0.000000
2023-10-13 11:01:10,668 ----------------------------------------------------------------------------------------------------
2023-10-13 11:01:10,668 EPOCH 10 done: loss 0.0086 - lr: 0.000000
2023-10-13 11:01:11,517 DEV : loss 0.19478590786457062 - f1-score (micro avg) 0.835
2023-10-13 11:01:11,880 ----------------------------------------------------------------------------------------------------
2023-10-13 11:01:11,881 Loading model from best epoch ...
2023-10-13 11:01:13,251 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 11:01:14,070 Results:
- F-score (micro) 0.8232
- F-score (macro) 0.4943
- Accuracy 0.7193

By class:
               precision    recall  f1-score   support

        pers      0.8611    0.8921    0.8763       139
       scope      0.8321    0.8837    0.8571       129
        work      0.6848    0.7875    0.7326        80
         loc      0.5714    0.4444    0.5000         9
        date      0.0000    0.0000    0.0000         3
      object      0.0000    0.0000    0.0000         0

   micro avg      0.8005    0.8472    0.8232       360
   macro avg      0.4916    0.5013    0.4943       360
weighted avg      0.7971    0.8472    0.8208       360

2023-10-13 11:01:14,070 ----------------------------------------------------------------------------------------------------