2023-10-16 18:03:29,369 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,369 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,370 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,370 Train: 1166 sentences
2023-10-16 18:03:29,370 (train_with_dev=False, train_with_test=False)
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,370 Training Params:
2023-10-16 18:03:29,370  - learning_rate: "5e-05"
2023-10-16 18:03:29,370  - mini_batch_size: "8"
2023-10-16 18:03:29,370  - max_epochs: "10"
2023-10-16 18:03:29,370  - shuffle: "True"
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,370 Plugins:
2023-10-16 18:03:29,370  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:03:29,370 ----------------------------------------------------------------------------------------------------
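The LinearScheduler plugin above warms the learning rate up over the first 10% of the roughly 146 iterations × 10 epochs = 1460 optimizer steps and then decays it linearly toward zero, which is why the per-iteration lr values below first climb toward 5e-05 (end of epoch 1) and then fall off to 0.000000 by the end of epoch 10. A minimal sketch of that schedule (illustrative only; the function name and step bookkeeping are assumptions, not Flair's internals, and the logged values are rounded to six decimals):

```python
def linear_warmup_lr(step, total_steps, peak_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to zero (illustrative sketch)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 146 of 1460 steps here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # warmup phase
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay phase

# Epoch 1, iter 14 (step 14): still warming up, lr is roughly 4.8e-06,
# consistent with the "lr: 0.000004" in the first iteration line below.
lr_at_step_14 = linear_warmup_lr(14, 1460)
```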
2023-10-16 18:03:29,370 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:03:29,371  - metric: "('micro avg', 'f1-score')"
2023-10-16 18:03:29,371 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,371 Computation:
2023-10-16 18:03:29,371  - compute on device: cuda:0
2023-10-16 18:03:29,371  - embedding storage: none
2023-10-16 18:03:29,371 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,371 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-16 18:03:29,371 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:29,371 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:30,832 epoch 1 - iter 14/146 - loss 3.01497338 - time (sec): 1.46 - samples/sec: 2840.69 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:03:32,586 epoch 1 - iter 28/146 - loss 2.66432612 - time (sec): 3.21 - samples/sec: 2976.51 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:03:33,881 epoch 1 - iter 42/146 - loss 2.22017613 - time (sec): 4.51 - samples/sec: 2961.61 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:03:35,066 epoch 1 - iter 56/146 - loss 1.88824002 - time (sec): 5.69 - samples/sec: 3029.64 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:03:36,628 epoch 1 - iter 70/146 - loss 1.59845017 - time (sec): 7.26 - samples/sec: 3000.46 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:03:38,129 epoch 1 - iter 84/146 - loss 1.41561660 - time (sec): 8.76 - samples/sec: 2977.65 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:03:39,479 epoch 1 - iter 98/146 - loss 1.31091327 - time (sec): 10.11 - samples/sec: 3005.20 - lr: 0.000033 - momentum: 0.000000
2023-10-16 18:03:40,770 epoch 1 - iter 112/146 - loss 1.19566679 - time (sec): 11.40 - samples/sec: 3019.72 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:03:42,096 epoch 1 - iter 126/146 - loss 1.10677521 - time (sec): 12.72 - samples/sec: 2998.40 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:03:43,586 epoch 1 - iter 140/146 - loss 1.01852525 - time (sec): 14.21 - samples/sec: 3008.21 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:03:44,182 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:44,183 EPOCH 1 done: loss 0.9945 - lr: 0.000048
2023-10-16 18:03:45,011 DEV : loss 0.21267659962177277 - f1-score (micro avg) 0.4215
2023-10-16 18:03:45,017 saving best model
2023-10-16 18:03:45,421 ----------------------------------------------------------------------------------------------------
2023-10-16 18:03:47,141 epoch 2 - iter 14/146 - loss 0.19846158 - time (sec): 1.72 - samples/sec: 2526.85 - lr: 0.000050 - momentum: 0.000000
2023-10-16 18:03:48,426 epoch 2 - iter 28/146 - loss 0.25337461 - time (sec): 3.00 - samples/sec: 2817.46 - lr: 0.000049 - momentum: 0.000000
2023-10-16 18:03:49,670 epoch 2 - iter 42/146 - loss 0.24819521 - time (sec): 4.25 - samples/sec: 3003.01 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:03:51,044 epoch 2 - iter 56/146 - loss 0.22898805 - time (sec): 5.62 - samples/sec: 2998.40 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:03:52,231 epoch 2 - iter 70/146 - loss 0.21737529 - time (sec): 6.81 - samples/sec: 2987.47 - lr: 0.000047 - momentum: 0.000000
2023-10-16 18:03:53,641 epoch 2 - iter 84/146 - loss 0.21818368 - time (sec): 8.22 - samples/sec: 2953.87 - lr: 0.000047 - momentum: 0.000000
2023-10-16 18:03:55,531 epoch 2 - iter 98/146 - loss 0.21871720 - time (sec): 10.11 - samples/sec: 2914.81 - lr: 0.000046 - momentum: 0.000000
2023-10-16 18:03:57,179 epoch 2 - iter 112/146 - loss 0.21468634 - time (sec): 11.76 - samples/sec: 2886.73 - lr: 0.000046 - momentum: 0.000000
2023-10-16 18:03:58,761 epoch 2 - iter 126/146 - loss 0.21766037 - time (sec): 13.34 - samples/sec: 2867.10 - lr: 0.000045 - momentum: 0.000000
2023-10-16 18:04:00,342 epoch 2 - iter 140/146 - loss 0.20901073 - time (sec): 14.92 - samples/sec: 2891.15 - lr: 0.000045 - momentum: 0.000000
2023-10-16 18:04:00,765 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:00,765 EPOCH 2 done: loss 0.2086 - lr: 0.000045
2023-10-16 18:04:02,072 DEV : loss 0.1228543296456337 - f1-score (micro avg) 0.6842
2023-10-16 18:04:02,078 saving best model
2023-10-16 18:04:02,611 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:04,294 epoch 3 - iter 14/146 - loss 0.14117504 - time (sec): 1.68 - samples/sec: 3205.04 - lr: 0.000044 - momentum: 0.000000
2023-10-16 18:04:05,541 epoch 3 - iter 28/146 - loss 0.12690368 - time (sec): 2.93 - samples/sec: 2907.39 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:04:06,769 epoch 3 - iter 42/146 - loss 0.13116279 - time (sec): 4.16 - samples/sec: 3099.71 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:04:08,164 epoch 3 - iter 56/146 - loss 0.12578848 - time (sec): 5.55 - samples/sec: 3162.13 - lr: 0.000042 - momentum: 0.000000
2023-10-16 18:04:09,644 epoch 3 - iter 70/146 - loss 0.11850282 - time (sec): 7.03 - samples/sec: 3196.95 - lr: 0.000042 - momentum: 0.000000
2023-10-16 18:04:11,025 epoch 3 - iter 84/146 - loss 0.11666530 - time (sec): 8.41 - samples/sec: 3173.50 - lr: 0.000041 - momentum: 0.000000
2023-10-16 18:04:12,066 epoch 3 - iter 98/146 - loss 0.11535129 - time (sec): 9.45 - samples/sec: 3129.27 - lr: 0.000041 - momentum: 0.000000
2023-10-16 18:04:13,501 epoch 3 - iter 112/146 - loss 0.11610264 - time (sec): 10.89 - samples/sec: 3096.61 - lr: 0.000040 - momentum: 0.000000
2023-10-16 18:04:14,912 epoch 3 - iter 126/146 - loss 0.11279962 - time (sec): 12.30 - samples/sec: 3085.89 - lr: 0.000040 - momentum: 0.000000
2023-10-16 18:04:16,435 epoch 3 - iter 140/146 - loss 0.11327376 - time (sec): 13.82 - samples/sec: 3090.83 - lr: 0.000039 - momentum: 0.000000
2023-10-16 18:04:17,059 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:17,059 EPOCH 3 done: loss 0.1124 - lr: 0.000039
2023-10-16 18:04:18,296 DEV : loss 0.11312269419431686 - f1-score (micro avg) 0.7093
2023-10-16 18:04:18,300 saving best model
2023-10-16 18:04:18,736 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:20,303 epoch 4 - iter 14/146 - loss 0.07271600 - time (sec): 1.56 - samples/sec: 2859.63 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:04:21,947 epoch 4 - iter 28/146 - loss 0.08092845 - time (sec): 3.21 - samples/sec: 2813.11 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:04:23,283 epoch 4 - iter 42/146 - loss 0.07155158 - time (sec): 4.54 - samples/sec: 2848.06 - lr: 0.000037 - momentum: 0.000000
2023-10-16 18:04:24,679 epoch 4 - iter 56/146 - loss 0.06922215 - time (sec): 5.94 - samples/sec: 2916.61 - lr: 0.000037 - momentum: 0.000000
2023-10-16 18:04:26,059 epoch 4 - iter 70/146 - loss 0.07262218 - time (sec): 7.32 - samples/sec: 2935.48 - lr: 0.000036 - momentum: 0.000000
2023-10-16 18:04:27,517 epoch 4 - iter 84/146 - loss 0.07370568 - time (sec): 8.78 - samples/sec: 2884.56 - lr: 0.000036 - momentum: 0.000000
2023-10-16 18:04:28,799 epoch 4 - iter 98/146 - loss 0.07320739 - time (sec): 10.06 - samples/sec: 2889.11 - lr: 0.000035 - momentum: 0.000000
2023-10-16 18:04:30,342 epoch 4 - iter 112/146 - loss 0.07873008 - time (sec): 11.60 - samples/sec: 2889.88 - lr: 0.000035 - momentum: 0.000000
2023-10-16 18:04:31,904 epoch 4 - iter 126/146 - loss 0.07512300 - time (sec): 13.16 - samples/sec: 2927.05 - lr: 0.000034 - momentum: 0.000000
2023-10-16 18:04:33,248 epoch 4 - iter 140/146 - loss 0.07444114 - time (sec): 14.51 - samples/sec: 2929.89 - lr: 0.000034 - momentum: 0.000000
2023-10-16 18:04:33,902 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:33,902 EPOCH 4 done: loss 0.0751 - lr: 0.000034
2023-10-16 18:04:35,153 DEV : loss 0.10812485218048096 - f1-score (micro avg) 0.7446
2023-10-16 18:04:35,158 saving best model
2023-10-16 18:04:35,666 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:37,282 epoch 5 - iter 14/146 - loss 0.05012667 - time (sec): 1.61 - samples/sec: 2978.46 - lr: 0.000033 - momentum: 0.000000
2023-10-16 18:04:38,724 epoch 5 - iter 28/146 - loss 0.05362344 - time (sec): 3.05 - samples/sec: 2896.15 - lr: 0.000032 - momentum: 0.000000
2023-10-16 18:04:39,994 epoch 5 - iter 42/146 - loss 0.05602592 - time (sec): 4.32 - samples/sec: 3049.46 - lr: 0.000032 - momentum: 0.000000
2023-10-16 18:04:41,436 epoch 5 - iter 56/146 - loss 0.05228982 - time (sec): 5.76 - samples/sec: 3067.34 - lr: 0.000031 - momentum: 0.000000
2023-10-16 18:04:43,165 epoch 5 - iter 70/146 - loss 0.05023073 - time (sec): 7.49 - samples/sec: 2978.39 - lr: 0.000031 - momentum: 0.000000
2023-10-16 18:04:44,560 epoch 5 - iter 84/146 - loss 0.05370307 - time (sec): 8.89 - samples/sec: 2991.28 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:04:45,946 epoch 5 - iter 98/146 - loss 0.05217695 - time (sec): 10.28 - samples/sec: 3019.20 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:04:47,101 epoch 5 - iter 112/146 - loss 0.05378885 - time (sec): 11.43 - samples/sec: 3022.07 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:04:48,507 epoch 5 - iter 126/146 - loss 0.05094919 - time (sec): 12.84 - samples/sec: 3017.31 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:04:49,841 epoch 5 - iter 140/146 - loss 0.05093383 - time (sec): 14.17 - samples/sec: 3019.63 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:04:50,475 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:50,475 EPOCH 5 done: loss 0.0505 - lr: 0.000028
2023-10-16 18:04:51,717 DEV : loss 0.12893062829971313 - f1-score (micro avg) 0.7451
2023-10-16 18:04:51,722 saving best model
2023-10-16 18:04:52,219 ----------------------------------------------------------------------------------------------------
2023-10-16 18:04:53,686 epoch 6 - iter 14/146 - loss 0.05350656 - time (sec): 1.46 - samples/sec: 2785.60 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:04:55,172 epoch 6 - iter 28/146 - loss 0.04160296 - time (sec): 2.95 - samples/sec: 2864.73 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:04:56,281 epoch 6 - iter 42/146 - loss 0.04397600 - time (sec): 4.06 - samples/sec: 2902.44 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:04:57,844 epoch 6 - iter 56/146 - loss 0.04078383 - time (sec): 5.62 - samples/sec: 2960.99 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:04:58,952 epoch 6 - iter 70/146 - loss 0.03860104 - time (sec): 6.73 - samples/sec: 2941.14 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:05:00,612 epoch 6 - iter 84/146 - loss 0.03725238 - time (sec): 8.39 - samples/sec: 2888.25 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:05:02,296 epoch 6 - iter 98/146 - loss 0.03930290 - time (sec): 10.07 - samples/sec: 2919.23 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:05:03,668 epoch 6 - iter 112/146 - loss 0.03821762 - time (sec): 11.44 - samples/sec: 2953.46 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:05:05,152 epoch 6 - iter 126/146 - loss 0.03812065 - time (sec): 12.93 - samples/sec: 2955.70 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:05:06,486 epoch 6 - iter 140/146 - loss 0.03699667 - time (sec): 14.26 - samples/sec: 2988.15 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:05:07,036 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:07,036 EPOCH 6 done: loss 0.0362 - lr: 0.000023
2023-10-16 18:05:08,266 DEV : loss 0.12390300631523132 - f1-score (micro avg) 0.7414
2023-10-16 18:05:08,271 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:09,553 epoch 7 - iter 14/146 - loss 0.01743334 - time (sec): 1.28 - samples/sec: 3239.48 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:05:10,816 epoch 7 - iter 28/146 - loss 0.01770293 - time (sec): 2.54 - samples/sec: 3207.44 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:05:12,064 epoch 7 - iter 42/146 - loss 0.03132041 - time (sec): 3.79 - samples/sec: 3161.58 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:05:13,842 epoch 7 - iter 56/146 - loss 0.03031072 - time (sec): 5.57 - samples/sec: 3063.29 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:05:15,512 epoch 7 - iter 70/146 - loss 0.03222962 - time (sec): 7.24 - samples/sec: 2966.48 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:05:17,073 epoch 7 - iter 84/146 - loss 0.03034968 - time (sec): 8.80 - samples/sec: 2895.56 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:05:18,505 epoch 7 - iter 98/146 - loss 0.02880115 - time (sec): 10.23 - samples/sec: 2895.60 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:05:19,805 epoch 7 - iter 112/146 - loss 0.02780285 - time (sec): 11.53 - samples/sec: 2939.47 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:05:21,415 epoch 7 - iter 126/146 - loss 0.02885371 - time (sec): 13.14 - samples/sec: 2937.77 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:05:22,691 epoch 7 - iter 140/146 - loss 0.02990561 - time (sec): 14.42 - samples/sec: 2952.74 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:05:23,307 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:23,307 EPOCH 7 done: loss 0.0292 - lr: 0.000017
2023-10-16 18:05:24,837 DEV : loss 0.13431765139102936 - f1-score (micro avg) 0.7837
2023-10-16 18:05:24,843 saving best model
2023-10-16 18:05:25,379 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:26,644 epoch 8 - iter 14/146 - loss 0.01998228 - time (sec): 1.26 - samples/sec: 2827.62 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:05:28,217 epoch 8 - iter 28/146 - loss 0.01806968 - time (sec): 2.83 - samples/sec: 2953.41 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:05:29,581 epoch 8 - iter 42/146 - loss 0.01593014 - time (sec): 4.20 - samples/sec: 2977.35 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:05:31,066 epoch 8 - iter 56/146 - loss 0.01649775 - time (sec): 5.68 - samples/sec: 2983.75 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:05:32,577 epoch 8 - iter 70/146 - loss 0.01690638 - time (sec): 7.19 - samples/sec: 2953.54 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:05:34,016 epoch 8 - iter 84/146 - loss 0.01728427 - time (sec): 8.63 - samples/sec: 2931.99 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:05:35,351 epoch 8 - iter 98/146 - loss 0.01793460 - time (sec): 9.97 - samples/sec: 2912.50 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:05:36,764 epoch 8 - iter 112/146 - loss 0.02025946 - time (sec): 11.38 - samples/sec: 2919.80 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:05:37,985 epoch 8 - iter 126/146 - loss 0.02049963 - time (sec): 12.60 - samples/sec: 2942.14 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:05:39,695 epoch 8 - iter 140/146 - loss 0.02027892 - time (sec): 14.31 - samples/sec: 2967.94 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:05:40,439 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:40,439 EPOCH 8 done: loss 0.0198 - lr: 0.000012
2023-10-16 18:05:41,765 DEV : loss 0.15437102317810059 - f1-score (micro avg) 0.7479
2023-10-16 18:05:41,774 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:43,312 epoch 9 - iter 14/146 - loss 0.01195428 - time (sec): 1.54 - samples/sec: 2860.41 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:05:44,705 epoch 9 - iter 28/146 - loss 0.00962441 - time (sec): 2.93 - samples/sec: 2836.03 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:05:46,047 epoch 9 - iter 42/146 - loss 0.01059304 - time (sec): 4.27 - samples/sec: 2842.07 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:05:47,676 epoch 9 - iter 56/146 - loss 0.01847025 - time (sec): 5.90 - samples/sec: 2896.60 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:05:49,125 epoch 9 - iter 70/146 - loss 0.01586688 - time (sec): 7.35 - samples/sec: 2895.07 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:05:50,344 epoch 9 - iter 84/146 - loss 0.01537808 - time (sec): 8.57 - samples/sec: 2959.60 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:05:51,778 epoch 9 - iter 98/146 - loss 0.01688221 - time (sec): 10.00 - samples/sec: 2940.78 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:05:53,187 epoch 9 - iter 112/146 - loss 0.01695533 - time (sec): 11.41 - samples/sec: 2952.98 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:05:54,787 epoch 9 - iter 126/146 - loss 0.01675493 - time (sec): 13.01 - samples/sec: 2933.72 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:05:56,392 epoch 9 - iter 140/146 - loss 0.01614276 - time (sec): 14.62 - samples/sec: 2926.73 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:05:56,937 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:56,938 EPOCH 9 done: loss 0.0156 - lr: 0.000006
2023-10-16 18:05:58,185 DEV : loss 0.15635186433792114 - f1-score (micro avg) 0.7368
2023-10-16 18:05:58,190 ----------------------------------------------------------------------------------------------------
2023-10-16 18:05:59,477 epoch 10 - iter 14/146 - loss 0.00619091 - time (sec): 1.29 - samples/sec: 2928.89 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:06:01,000 epoch 10 - iter 28/146 - loss 0.00503078 - time (sec): 2.81 - samples/sec: 3186.76 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:06:02,933 epoch 10 - iter 42/146 - loss 0.01384968 - time (sec): 4.74 - samples/sec: 3059.21 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:06:04,330 epoch 10 - iter 56/146 - loss 0.01118529 - time (sec): 6.14 - samples/sec: 3127.52 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:06:05,732 epoch 10 - iter 70/146 - loss 0.01020773 - time (sec): 7.54 - samples/sec: 3083.99 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:06:07,104 epoch 10 - iter 84/146 - loss 0.01045028 - time (sec): 8.91 - samples/sec: 3031.46 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:06:08,416 epoch 10 - iter 98/146 - loss 0.00963056 - time (sec): 10.22 - samples/sec: 3020.03 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:06:09,777 epoch 10 - iter 112/146 - loss 0.01023756 - time (sec): 11.59 - samples/sec: 2991.65 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:06:11,036 epoch 10 - iter 126/146 - loss 0.01175292 - time (sec): 12.84 - samples/sec: 2989.54 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:06:12,247 epoch 10 - iter 140/146 - loss 0.01174241 - time (sec): 14.06 - samples/sec: 3014.32 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:06:12,946 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:12,946 EPOCH 10 done: loss 0.0119 - lr: 0.000000
2023-10-16 18:06:14,485 DEV : loss 0.16552899777889252 - f1-score (micro avg) 0.7468
2023-10-16 18:06:14,965 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:14,967 Loading model from best epoch ...
2023-10-16 18:06:16,437 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
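The 17-entry tag dictionary is the BIOES encoding of the corpus's four entity types (LOC, PER, ORG, HumanProd) plus the outside tag O, which is also why the model's final linear layer has out_features=17. A quick illustrative check:

```python
# BIOES scheme: each entity type appears with S- (single-token span), B- (begin),
# E- (end) and I- (inside) prefixes; O marks tokens outside any entity.
entity_types = ["LOC", "PER", "ORG", "HumanProd"]
prefixes = ["S-", "B-", "E-", "I-"]
tag_dictionary = ["O"] + [p + t for t in entity_types for p in prefixes]
assert len(tag_dictionary) == 17  # matches Linear(in_features=768, out_features=17)
```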
2023-10-16 18:06:18,838
Results:
- F-score (micro) 0.7565
- F-score (macro) 0.6684
- Accuracy 0.6286

By class:
              precision    recall  f1-score   support

         PER     0.7958    0.8621    0.8276       348
         LOC     0.6524    0.8199    0.7267       261
         ORG     0.4773    0.4038    0.4375        52
   HumanProd     0.6818    0.6818    0.6818        22

   micro avg     0.7134    0.8053    0.7565       683
   macro avg     0.6518    0.6919    0.6684       683
weighted avg     0.7131    0.8053    0.7546       683
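As a sanity check on the table above, each f1-score is the harmonic mean of its row's precision and recall; recomputing from the printed (rounded) figures reproduces the reported values to within rounding:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg row: precision 0.7134, recall 0.8053 -> close to the reported 0.7565
assert abs(f1_score(0.7134, 0.8053) - 0.7565) < 1e-3
# PER row: precision 0.7958, recall 0.8621 -> close to the reported 0.8276
assert abs(f1_score(0.7958, 0.8621) - 0.8276) < 1e-3
```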
2023-10-16 18:06:18,838 ----------------------------------------------------------------------------------------------------