2024-09-04 18:26:58.019800: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-09-04 18:26:58.038161: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-04 18:26:58.059897: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-04 18:26:58.066439: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-09-04 18:26:58.082659: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-09-04 18:26:59.362821: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/local/lib/python3.10/dist-packages/transformers/training_args.py:1525: FutureWarning: `evaluation_strategy` is deprecated and will be removed in version 4.46 of 🤗 Transformers.
Use `eval_strategy` instead warnings.warn( 09/04/2024 18:27:00 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False 09/04/2024 18:27:00 - INFO - __main__ - Training/evaluation parameters TrainingArguments( _n_gpu=1, accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False}, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, batch_eval_metrics=False, bf16=False, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=0, dataloader_persistent_workers=False, dataloader_pin_memory=True, dataloader_prefetch_factor=None, ddp_backend=None, ddp_broadcast_buffers=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=None, disable_tqdm=False, dispatch_batches=None, do_eval=True, do_predict=True, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_do_concat_batches=True, eval_on_start=False, eval_steps=None, eval_strategy=epoch, eval_use_gather_object=False, evaluation_strategy=epoch, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gradient_accumulation_steps=2, gradient_checkpointing=False, gradient_checkpointing_kwargs=None, greater_is_better=True, group_by_length=False, half_precision_backend=auto, hub_always_push=False, hub_model_id=None, hub_private_repo=False, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_inputs_for_metrics=False, include_num_input_tokens_seen=False, include_tokens_per_second=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, 
learning_rate=5e-05, length_column_name=length, load_best_model_at_end=True, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=/content/dissertation/scripts/ner/output/tb, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=500, logging_strategy=steps, lr_scheduler_kwargs={}, lr_scheduler_type=linear, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=f1, mp_parameters=, neftune_noise_alpha=None, no_cuda=False, num_train_epochs=10.0, optim=adamw_torch, optim_args=None, optim_target_modules=None, output_dir=/content/dissertation/scripts/ner/output, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=8, per_device_train_batch_size=32, prediction_loss_only=False, push_to_hub=True, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], restore_callback_states_from_checkpoint=False, resume_from_checkpoint=None, run_name=/content/dissertation/scripts/ner/output, save_on_each_node=False, save_only_model=False, save_safetensors=True, save_steps=500, save_strategy=epoch, save_total_limit=None, seed=42, skip_memory_metrics=True, split_batches=None, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torch_empty_cache_steps=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_cpu=False, use_ipex=False, use_legacy_prediction_loop=False, use_mps_device=False, warmup_ratio=0.0, warmup_steps=0, weight_decay=0.0, ) Downloading builder script: 0%| | 0.00/3.91k [00:00> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/config.json [INFO|configuration_utils.py:800] 2024-09-04 18:27:16,984 >> Model config RobertaConfig { "_name_or_path": "PlanTL-GOB-ES/bsc-bio-ehr-es", "architectures": [ "RobertaForMaskedLM" ], 
"attention_probs_dropout_prob": 0.1, "bos_token_id": 0, "classifier_dropout": null, "eos_token_id": 2, "finetuning_task": "ner", "gradient_checkpointing": false, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "id2label": { "0": "O", "1": "B-SINTOMA", "2": "I-SINTOMA" }, "initializer_range": 0.02, "intermediate_size": 3072, "label2id": { "B-SINTOMA": 1, "I-SINTOMA": 2, "O": 0 }, "layer_norm_eps": 1e-05, "max_position_embeddings": 514, "model_type": "roberta", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "position_embedding_type": "absolute", "transformers_version": "4.44.2", "type_vocab_size": 1, "use_cache": true, "vocab_size": 50262 } [INFO|configuration_utils.py:733] 2024-09-04 18:27:17,622 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/config.json [INFO|configuration_utils.py:800] 2024-09-04 18:27:17,623 >> Model config RobertaConfig { "_name_or_path": "PlanTL-GOB-ES/bsc-bio-ehr-es", "architectures": [ "RobertaForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "bos_token_id": 0, "classifier_dropout": null, "eos_token_id": 2, "gradient_checkpointing": false, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-05, "max_position_embeddings": 514, "model_type": "roberta", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "position_embedding_type": "absolute", "transformers_version": "4.44.2", "type_vocab_size": 1, "use_cache": true, "vocab_size": 50262 } [INFO|tokenization_utils_base.py:2269] 2024-09-04 18:27:21,902 >> loading file vocab.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/vocab.json [INFO|tokenization_utils_base.py:2269] 2024-09-04 18:27:21,902 >> loading file merges.txt from 
cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/merges.txt [INFO|tokenization_utils_base.py:2269] 2024-09-04 18:27:21,902 >> loading file tokenizer.json from cache at None [INFO|tokenization_utils_base.py:2269] 2024-09-04 18:27:21,903 >> loading file added_tokens.json from cache at None [INFO|tokenization_utils_base.py:2269] 2024-09-04 18:27:21,903 >> loading file special_tokens_map.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/special_tokens_map.json [INFO|tokenization_utils_base.py:2269] 2024-09-04 18:27:21,903 >> loading file tokenizer_config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/tokenizer_config.json [INFO|configuration_utils.py:733] 2024-09-04 18:27:21,903 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/config.json [INFO|configuration_utils.py:800] 2024-09-04 18:27:21,904 >> Model config RobertaConfig { "_name_or_path": "PlanTL-GOB-ES/bsc-bio-ehr-es", "architectures": [ "RobertaForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "bos_token_id": 0, "classifier_dropout": null, "eos_token_id": 2, "gradient_checkpointing": false, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-05, "max_position_embeddings": 514, "model_type": "roberta", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "position_embedding_type": "absolute", "transformers_version": "4.44.2", "type_vocab_size": 1, "use_cache": true, "vocab_size": 50262 } /usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not 
set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884 warnings.warn( [INFO|configuration_utils.py:733] 2024-09-04 18:27:21,979 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/config.json [INFO|configuration_utils.py:800] 2024-09-04 18:27:21,981 >> Model config RobertaConfig { "_name_or_path": "PlanTL-GOB-ES/bsc-bio-ehr-es", "architectures": [ "RobertaForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "bos_token_id": 0, "classifier_dropout": null, "eos_token_id": 2, "gradient_checkpointing": false, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-05, "max_position_embeddings": 514, "model_type": "roberta", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 1, "position_embedding_type": "absolute", "transformers_version": "4.44.2", "type_vocab_size": 1, "use_cache": true, "vocab_size": 50262 } [INFO|modeling_utils.py:3678] 2024-09-04 18:27:26,079 >> loading weights file pytorch_model.bin from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/pytorch_model.bin [INFO|modeling_utils.py:4497] 2024-09-04 18:27:26,219 >> Some weights of the model checkpoint at PlanTL-GOB-ES/bsc-bio-ehr-es were not used when initializing RobertaForTokenClassification: ['lm_head.bias', 'lm_head.decoder.bias', 'lm_head.decoder.weight', 'lm_head.dense.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.layer_norm.weight'] - This IS expected if you are initializing RobertaForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. 
initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[WARNING|modeling_utils.py:4509] 2024-09-04 18:27:26,219 >> Some weights of RobertaForTokenClassification were not initialized from the model checkpoint at PlanTL-GOB-ES/bsc-bio-ehr-es and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Map: 0%| | 0/15848 [00:00> The following columns in the training set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: ner_tags, id, tokens. If ner_tags, id, tokens are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:2134] 2024-09-04 18:27:35,302 >> ***** Running training *****
[INFO|trainer.py:2135] 2024-09-04 18:27:35,302 >> Num examples = 15,848
[INFO|trainer.py:2136] 2024-09-04 18:27:35,302 >> Num Epochs = 10
[INFO|trainer.py:2137] 2024-09-04 18:27:35,302 >> Instantaneous batch size per device = 32
[INFO|trainer.py:2140] 2024-09-04 18:27:35,302 >> Total train batch size (w. parallel, distributed & accumulation) = 64
[INFO|trainer.py:2141] 2024-09-04 18:27:35,302 >> Gradient Accumulation steps = 2
[INFO|trainer.py:2142] 2024-09-04 18:27:35,302 >> Total optimization steps = 2,480
[INFO|trainer.py:2143] 2024-09-04 18:27:35,303 >> Number of trainable parameters = 124,055,043
0%| | 0/2480 [00:00> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: ner_tags, id, tokens.
If ner_tags, id, tokens are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message. [INFO|trainer.py:3819] 2024-09-04 18:28:37,835 >> ***** Running Evaluation ***** [INFO|trainer.py:3821] 2024-09-04 18:28:37,835 >> Num examples = 2519 [INFO|trainer.py:3824] 2024-09-04 18:28:37,835 >> Batch size = 8 0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-248 [INFO|configuration_utils.py:472] 2024-09-04 18:28:43,351 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-248/config.json [INFO|modeling_utils.py:2799] 2024-09-04 18:28:44,360 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-248/model.safetensors [INFO|tokenization_utils_base.py:2684] 2024-09-04 18:28:44,361 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-248/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-04 18:28:44,362 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-248/special_tokens_map.json [INFO|tokenization_utils_base.py:2684] 2024-09-04 18:28:46,425 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-04 18:28:46,425 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json 10%|█ | 249/2480 [01:11<1:44:55, 2.82s/it] 10%|█ | 250/2480 [01:11<1:15:54, 2.04s/it] 10%|█ | 251/2480 [01:11<55:54, 1.51s/it] 10%|█ | 252/2480 [01:12<41:42, 1.12s/it] 10%|█ | 253/2480 [01:12<31:32, 1.18it/s] 10%|█ | 254/2480 [01:12<24:51, 1.49it/s] 10%|█ | 255/2480 [01:12<21:00, 1.77it/s] 10%|█ | 256/2480 [01:13<17:50, 2.08it/s] 10%|█ | 257/2480 [01:13<15:06, 2.45it/s] 10%|█ | 258/2480 [01:13<12:55, 2.87it/s] 10%|█ | 259/2480 [01:13<11:19, 3.27it/s] 10%|█ | 260/2480 [01:14<10:42, 3.45it/s] 11%|█ | 261/2480 [01:14<10:07, 3.65it/s] 11%|█ | 262/2480 [01:14<09:25, 3.92it/s] 11%|█ 
| 263/2480 [01:14<09:55, 3.72it/s] ... 20%|██ | 496/2480 [02:12<07:29, 4.42it/s][INFO|trainer.py:811] 2024-09-04 18:29:47,424 >> The following columns in the evaluation set don't have a corresponding argument in
`RobertaForTokenClassification.forward` and have been ignored: ner_tags, id, tokens. If ner_tags, id, tokens are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message. [INFO|trainer.py:3819] 2024-09-04 18:29:47,426 >> ***** Running Evaluation ***** [INFO|trainer.py:3821] 2024-09-04 18:29:47,426 >> Num examples = 2519 [INFO|trainer.py:3824] 2024-09-04 18:29:47,426 >> Batch size = 8 {'eval_loss': 0.16486208140850067, 'eval_precision': 0.5941821649976157, 'eval_recall': 0.6819923371647509, 'eval_f1': 0.63506625891947, 'eval_accuracy': 0.9477686162533286, 'eval_runtime': 5.5136, 'eval_samples_per_second': 456.871, 'eval_steps_per_second': 57.132, 'epoch': 1.0} 0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-496 [INFO|configuration_utils.py:472] 2024-09-04 18:29:52,847 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-496/config.json [INFO|modeling_utils.py:2799] 2024-09-04 18:29:53,893 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-496/model.safetensors [INFO|tokenization_utils_base.py:2684] 2024-09-04 18:29:53,894 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-496/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-04 18:29:53,895 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-496/special_tokens_map.json [INFO|tokenization_utils_base.py:2684] 2024-09-04 18:29:56,132 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-04 18:29:56,136 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json 20%|██ | 497/2480 [02:21<1:35:08, 2.88s/it] 20%|██ | 498/2480 [02:21<1:08:47, 2.08s/it] 20%|██ | 499/2480 [02:21<49:50, 1.51s/it] 20%|██ | 500/2480 [02:21<37:41, 1.14s/it] 20%|██ | 500/2480 [02:21<37:41, 1.14s/it] 
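The figures reported in the log so far can be cross-checked with a few lines of arithmetic. The sketch below is not the Trainer's own code: the helper names `optimization_steps` and `f1_score` are hypothetical, and the step count assumes that batches per epoch are rounded up and then floor-divided by the gradient accumulation steps, which reproduces the logged totals for this run. All input numbers are copied from the log entries above.

```python
import math


def optimization_steps(num_examples, per_device_batch, grad_accum, epochs, n_devices=1):
    """Return (updates per epoch, total updates) under the assumed rounding rule."""
    # Assumption: the last partial batch still counts as a batch (round up).
    batches_per_epoch = math.ceil(num_examples / (per_device_batch * n_devices))
    # Assumption: one optimizer update per `grad_accum` batches, floored.
    updates_per_epoch = max(batches_per_epoch // grad_accum, 1)
    return updates_per_epoch, updates_per_epoch * epochs


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the log: 15,848 examples, batch size 32, accumulation 2, 10 epochs.
effective_batch = 32 * 1 * 2
updates_per_epoch, total_steps = optimization_steps(15848, 32, 2, 10)

# Evaluation set: 2519 examples at batch size 8.
eval_batches = math.ceil(2519 / 8)

# eval_precision and eval_recall from the logged metrics dictionary.
f1 = f1_score(0.5941821649976157, 0.6819923371647509)

print(effective_batch, updates_per_epoch, total_steps, eval_batches, round(f1, 4))
```

Under these assumptions the results agree with the log: a total train batch size of 64, checkpoints every 248 steps (`checkpoint-248`, `checkpoint-496`), 2,480 optimization steps, 315 evaluation batches, and an F1 matching the logged `eval_f1` of about 0.6351.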
20%|██ | 501/2480 [02:22<28:28, 1.16it/s] ... 30%|██▉ | 732/2480 [03:19<07:24, 3.93it/s] 30%|██▉ | 733/2480
[03:19<06:49, 4.27it/s] 30%|██▉ | 734/2480 [03:19<07:56, 3.67it/s] 30%|██▉ | 735/2480 [03:20<07:07, 4.08it/s] 30%|██▉ | 736/2480 [03:20<07:04, 4.10it/s] 30%|██▉ | 737/2480 [03:20<06:41, 4.34it/s] 30%|██▉ | 738/2480 [03:20<06:05, 4.77it/s] 30%|██▉ | 739/2480 [03:20<05:59, 4.84it/s] 30%|██▉ | 740/2480 [03:21<06:21, 4.56it/s] 30%|██▉ | 741/2480 [03:21<06:33, 4.42it/s] 30%|██▉ | 742/2480 [03:21<06:51, 4.23it/s] 30%|██▉ | 743/2480 [03:21<07:48, 3.71it/s] 30%|███ | 744/2480 [03:22<07:19, 3.95it/s]
[INFO|trainer.py:811] 2024-09-04 18:30:57,444 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: ner_tags, id, tokens. If ner_tags, id, tokens are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3819] 2024-09-04 18:30:57,446 >> ***** Running Evaluation *****
[INFO|trainer.py:3821] 2024-09-04 18:30:57,446 >> Num examples = 2519
[INFO|trainer.py:3824] 2024-09-04 18:30:57,446 >> Batch size = 8
{'eval_loss': 0.18148483335971832, 'eval_precision': 0.6557815845824411, 'eval_recall': 0.6704980842911877, 'eval_f1': 0.6630581867388362, 'eval_accuracy': 0.9476402836151304, 'eval_runtime': 5.4181, 'eval_samples_per_second': 464.925, 'eval_steps_per_second': 58.139, 'epoch': 2.0}
{'loss': 0.134, 'grad_norm': 0.6250831484794617, 'learning_rate': 3.991935483870968e-05, 'epoch': 2.02}
0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-744
[INFO|configuration_utils.py:472] 2024-09-04 18:31:02,934 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-744/config.json
[INFO|modeling_utils.py:2799] 2024-09-04 18:31:03,965 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-744/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:31:03,966 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-744/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:31:03,966 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-744/special_tokens_map.json
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:31:06,078 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:31:06,078 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
30%|███ | 745/2480 [03:31<1:25:22, 2.95s/it] 30%|███ | 746/2480 [03:31<1:02:03, 2.15s/it] 30%|███ | 747/2480 [03:31<45:34, 1.58s/it] 30%|███ | 748/2480 [03:32<33:35, 1.16s/it] 30%|███ | 749/2480 [03:32<25:15, 1.14it/s] 30%|███ | 750/2480 [03:32<20:10, 1.43it/s] 30%|███ | 751/2480 [03:32<15:54, 1.81it/s] 30%|███ | 752/2480 [03:32<12:42, 2.27it/s] 30%|███ | 753/2480 [03:33<10:23, 2.77it/s] 30%|███ | 754/2480 [03:33<09:46, 2.94it/s] 30%|███ | 755/2480 [03:33<08:49, 3.26it/s] 30%|███ | 756/2480 [03:33<08:48, 3.26it/s] 31%|███ | 757/2480 [03:34<08:11, 3.51it/s] 31%|███ | 758/2480 [03:34<07:17, 3.94it/s] 31%|███ | 759/2480 [03:34<07:50, 3.66it/s] 31%|███ | 760/2480 [03:34<07:25, 3.86it/s] 31%|███ | 761/2480 [03:35<07:09, 4.01it/s] 31%|███ | 762/2480 [03:35<06:53, 4.15it/s] 31%|███ | 763/2480 [03:35<07:15, 3.94it/s] 31%|███ | 764/2480 [03:35<07:03, 4.05it/s] 31%|███ | 765/2480 [03:36<06:53, 4.14it/s] 31%|███ | 766/2480 [03:36<07:16, 3.93it/s] 31%|███ | 767/2480 [03:36<07:19, 3.90it/s] 31%|███ | 768/2480 [03:36<06:42, 4.26it/s] 31%|███ | 769/2480 [03:37<06:19, 4.51it/s] 31%|███ | 770/2480 [03:37<07:29, 3.81it/s] 31%|███ | 771/2480 [03:37<07:49, 3.64it/s] 31%|███ | 772/2480 [03:37<07:42, 3.69it/s] 31%|███ | 773/2480 [03:38<07:46, 3.66it/s] 31%|███ | 774/2480 [03:38<07:34, 3.76it/s] 31%|███▏ | 775/2480 [03:38<07:56, 3.58it/s] 31%|███▏ | 776/2480 [03:39<07:09, 3.97it/s] 31%|███▏ | 777/2480 [03:39<06:28, 4.38it/s]
31%|███▏ | 778/2480 [03:39<06:18, 4.50it/s] 31%|███▏ | 779/2480 [03:39<06:43, 4.22it/s] 31%|███▏ | 780/2480 [03:39<06:41, 4.23it/s] 31%|███▏ | 781/2480 [03:40<06:32, 4.33it/s] 32%|███▏ | 782/2480 [03:40<06:11, 4.57it/s] 32%|███▏ | 783/2480 [03:40<06:01, 4.70it/s] 32%|███▏ | 784/2480 [03:40<06:00, 4.71it/s] 32%|███▏ | 785/2480 [03:40<06:28, 4.37it/s] 32%|███▏ | 786/2480 [03:41<06:25, 4.40it/s] 32%|███▏ | 787/2480 [03:41<06:06, 4.62it/s] 32%|███▏ | 788/2480 [03:41<05:56, 4.74it/s] 32%|███▏ | 789/2480 [03:41<05:58, 4.72it/s] 32%|███▏ | 790/2480 [03:42<06:05, 4.63it/s] 32%|███▏ | 791/2480 [03:42<06:20, 4.44it/s] 32%|███▏ | 792/2480 [03:42<06:32, 4.30it/s] 32%|███▏ | 793/2480 [03:42<06:34, 4.27it/s] 32%|███▏ | 794/2480 [03:43<07:00, 4.01it/s] 32%|███▏ | 795/2480 [03:43<06:29, 4.33it/s] 32%|███▏ | 796/2480 [03:43<07:34, 3.71it/s] 32%|███▏ | 797/2480 [03:43<07:01, 3.99it/s] 32%|███▏ | 798/2480 [03:44<06:46, 4.14it/s] 32%|███▏ | 799/2480 [03:44<06:37, 4.23it/s] 32%|███▏ | 800/2480 [03:44<07:00, 3.99it/s] 32%|███▏ | 801/2480 [03:44<07:07, 3.92it/s] 32%|███▏ | 802/2480 [03:45<06:47, 4.12it/s] 32%|███▏ | 803/2480 [03:45<06:36, 4.23it/s] 32%|███▏ | 804/2480 [03:45<07:08, 3.91it/s] 32%|███▏ | 805/2480 [03:45<08:30, 3.28it/s] 32%|███▎ | 806/2480 [03:46<07:53, 3.54it/s] 33%|███▎ | 807/2480 [03:46<07:11, 3.88it/s] 33%|███▎ | 808/2480 [03:46<07:12, 3.87it/s] 33%|███▎ | 809/2480 [03:46<07:43, 3.61it/s] 33%|███▎ | 810/2480 [03:47<07:39, 3.64it/s] 33%|███▎ | 811/2480 [03:47<07:33, 3.68it/s] 33%|███▎ | 812/2480 [03:47<06:52, 4.04it/s] 33%|███▎ | 813/2480 [03:47<06:54, 4.03it/s] 33%|███▎ | 814/2480 [03:48<07:02, 3.94it/s] 33%|███▎ | 815/2480 [03:48<06:48, 4.08it/s] 33%|███▎ | 816/2480 [03:48<06:54, 4.02it/s] 33%|███▎ | 817/2480 [03:48<07:08, 3.88it/s] 33%|███▎ | 818/2480 [03:49<06:56, 3.99it/s] 33%|███▎ | 819/2480 [03:49<06:27, 4.28it/s] 33%|███▎ | 820/2480 [03:49<06:18, 4.39it/s] 33%|███▎ | 821/2480 [03:49<06:14, 4.43it/s] 33%|███▎ | 822/2480 [03:50<06:17, 4.39it/s] 33%|███▎ | 823/2480 
[03:50<06:14, 4.42it/s] 33%|███▎ | 824/2480 [03:50<06:12, 4.45it/s] 33%|███▎ | 825/2480 [03:50<06:35, 4.19it/s] 33%|███▎ | 826/2480 [03:51<06:27, 4.27it/s] 33%|███▎ | 827/2480 [03:51<06:17, 4.38it/s] 33%|███▎ | 828/2480 [03:51<06:06, 4.51it/s] 33%|███▎ | 829/2480 [03:51<06:07, 4.49it/s] 33%|███▎ | 830/2480 [03:51<07:04, 3.89it/s] 34%|███▎ | 831/2480 [03:52<06:54, 3.98it/s] 34%|███▎ | 832/2480 [03:52<06:53, 3.99it/s] 34%|███▎ | 833/2480 [03:52<07:30, 3.65it/s] 34%|███▎ | 834/2480 [03:53<07:09, 3.83it/s] 34%|███▎ | 835/2480 [03:53<07:06, 3.86it/s] 34%|███▎ | 836/2480 [03:53<07:24, 3.70it/s] 34%|███▍ | 837/2480 [03:53<07:04, 3.87it/s] 34%|███▍ | 838/2480 [03:54<06:49, 4.01it/s] 34%|███▍ | 839/2480 [03:54<06:34, 4.16it/s] 34%|███▍ | 840/2480 [03:54<05:58, 4.57it/s] 34%|███▍ | 841/2480 [03:54<05:54, 4.63it/s] 34%|███▍ | 842/2480 [03:54<06:05, 4.48it/s] 34%|███▍ | 843/2480 [03:55<05:58, 4.57it/s] 34%|███▍ | 844/2480 [03:55<06:16, 4.34it/s] 34%|███▍ | 845/2480 [03:55<08:07, 3.36it/s] 34%|███▍ | 846/2480 [03:56<07:42, 3.54it/s] 34%|███▍ | 847/2480 [03:56<07:24, 3.68it/s] 34%|███▍ | 848/2480 [03:56<06:38, 4.09it/s] 34%|███▍ | 849/2480 [03:56<06:35, 4.12it/s] 34%|███▍ | 850/2480 [03:57<07:00, 3.87it/s] 34%|███▍ | 851/2480 [03:57<06:18, 4.30it/s] 34%|███▍ | 852/2480 [03:57<06:18, 4.31it/s] 34%|███▍ | 853/2480 [03:57<06:34, 4.13it/s] 34%|███▍ | 854/2480 [03:58<07:23, 3.67it/s] 34%|███▍ | 855/2480 [03:58<06:59, 3.88it/s] 35%|███▍ | 856/2480 [03:58<07:08, 3.79it/s] 35%|███▍ | 857/2480 [03:58<07:01, 3.85it/s] 35%|███▍ | 858/2480 [03:59<06:47, 3.98it/s] 35%|███▍ | 859/2480 [03:59<06:43, 4.02it/s] 35%|███▍ | 860/2480 [03:59<07:01, 3.84it/s] 35%|███▍ | 861/2480 [03:59<07:08, 3.78it/s] 35%|███▍ | 862/2480 [03:59<06:24, 4.21it/s] 35%|███▍ | 863/2480 [04:00<06:14, 4.32it/s] 35%|███▍ | 864/2480 [04:00<06:39, 4.04it/s] 35%|███▍ | 865/2480 [04:00<06:07, 4.39it/s] 35%|███▍ | 866/2480 [04:00<06:09, 4.37it/s] 35%|███▍ | 867/2480 [04:01<05:54, 4.55it/s] 35%|███▌ | 868/2480 [04:01<06:02, 
4.45it/s] 35%|███▌ | 869/2480 [04:01<06:48, 3.94it/s] 35%|███▌ | 870/2480 [04:01<06:29, 4.13it/s] 35%|███▌ | 871/2480 [04:02<06:27, 4.15it/s] 35%|███▌ | 872/2480 [04:02<06:33, 4.09it/s] 35%|███▌ | 873/2480 [04:02<06:20, 4.22it/s] 35%|███▌ | 874/2480 [04:02<06:51, 3.91it/s] 35%|███▌ | 875/2480 [04:03<06:47, 3.94it/s] 35%|███▌ | 876/2480 [04:03<06:56, 3.85it/s] 35%|███▌ | 877/2480 [04:03<06:49, 3.91it/s] 35%|███▌ | 878/2480 [04:03<06:46, 3.94it/s] 35%|███▌ | 879/2480 [04:04<06:20, 4.21it/s] 35%|███▌ | 880/2480 [04:04<06:17, 4.24it/s] 36%|███▌ | 881/2480 [04:04<05:53, 4.52it/s] 36%|███▌ | 882/2480 [04:04<05:39, 4.70it/s] 36%|███▌ | 883/2480 [04:04<05:44, 4.64it/s] 36%|███▌ | 884/2480 [04:05<05:54, 4.50it/s] 36%|███▌ | 885/2480 [04:05<05:56, 4.48it/s] 36%|███▌ | 886/2480 [04:05<05:59, 4.43it/s] 36%|███▌ | 887/2480 [04:05<05:55, 4.49it/s] 36%|███▌ | 888/2480 [04:06<05:44, 4.62it/s] 36%|███▌ | 889/2480 [04:06<06:01, 4.41it/s] 36%|███▌ | 890/2480 [04:06<06:33, 4.04it/s] 36%|███▌ | 891/2480 [04:06<06:40, 3.97it/s] 36%|███▌ | 892/2480 [04:07<06:33, 4.04it/s] 36%|███▌ | 893/2480 [04:07<06:32, 4.04it/s] 36%|███▌ | 894/2480 [04:07<06:24, 4.12it/s] 36%|███▌ | 895/2480 [04:07<06:33, 4.03it/s] 36%|███▌ | 896/2480 [04:08<06:04, 4.34it/s] 36%|███▌ | 897/2480 [04:08<06:14, 4.22it/s] 36%|███▌ | 898/2480 [04:08<05:48, 4.54it/s] 36%|███▋ | 899/2480 [04:08<06:06, 4.31it/s] 36%|███▋ | 900/2480 [04:08<05:43, 4.59it/s] 36%|███▋ | 901/2480 [04:09<05:21, 4.91it/s] 36%|███▋ | 902/2480 [04:09<05:46, 4.56it/s] 36%|███▋ | 903/2480 [04:09<05:42, 4.61it/s] 36%|███▋ | 904/2480 [04:09<06:19, 4.15it/s] 36%|███▋ | 905/2480 [04:10<06:57, 3.77it/s] 37%|███▋ | 906/2480 [04:10<06:45, 3.88it/s] 37%|███▋ | 907/2480 [04:10<07:45, 3.38it/s] 37%|███▋ | 908/2480 [04:10<07:05, 3.70it/s] 37%|███▋ | 909/2480 [04:11<06:29, 4.03it/s] 37%|███▋ | 910/2480 [04:11<07:23, 3.54it/s] 37%|███▋ | 911/2480 [04:11<06:44, 3.88it/s] 37%|███▋ | 912/2480 [04:12<06:43, 3.89it/s] 37%|███▋ | 913/2480 [04:12<07:58, 3.28it/s] 37%|███▋ 
| 914/2480 [04:12<07:08, 3.66it/s] 37%|███▋ | 915/2480 [04:12<06:45, 3.86it/s] 37%|███▋ | 916/2480 [04:13<06:18, 4.13it/s] 37%|███▋ | 917/2480 [04:13<06:18, 4.13it/s] 37%|███▋ | 918/2480 [04:13<06:12, 4.19it/s] 37%|███▋ | 919/2480 [04:13<06:11, 4.21it/s] 37%|███▋ | 920/2480 [04:13<05:59, 4.34it/s] 37%|███▋ | 921/2480 [04:14<06:04, 4.28it/s] 37%|███▋ | 922/2480 [04:14<06:59, 3.72it/s] 37%|███▋ | 923/2480 [04:14<06:28, 4.01it/s] 37%|███▋ | 924/2480 [04:14<06:18, 4.11it/s] 37%|███▋ | 925/2480 [04:15<06:22, 4.07it/s] 37%|███▋ | 926/2480 [04:15<06:15, 4.14it/s] 37%|███▋ | 927/2480 [04:15<06:14, 4.15it/s] 37%|███▋ | 928/2480 [04:15<06:11, 4.18it/s] 37%|███▋ | 929/2480 [04:16<06:13, 4.15it/s] 38%|███▊ | 930/2480 [04:16<06:01, 4.29it/s] 38%|███▊ | 931/2480 [04:16<06:05, 4.23it/s] 38%|███▊ | 932/2480 [04:16<05:58, 4.32it/s] 38%|███▊ | 933/2480 [04:17<06:12, 4.16it/s] 38%|███▊ | 934/2480 [04:17<05:49, 4.43it/s] 38%|███▊ | 935/2480 [04:17<06:09, 4.18it/s] 38%|███▊ | 936/2480 [04:17<05:46, 4.46it/s] 38%|███▊ | 937/2480 [04:18<05:40, 4.54it/s] 38%|███▊ | 938/2480 [04:18<06:09, 4.18it/s] 38%|███▊ | 939/2480 [04:18<06:14, 4.12it/s] 38%|███▊ | 940/2480 [04:18<05:56, 4.32it/s] 38%|███▊ | 941/2480 [04:19<06:33, 3.91it/s] 38%|███▊ | 942/2480 [04:19<07:03, 3.63it/s] 38%|███▊ | 943/2480 [04:19<06:34, 3.90it/s] 38%|███▊ | 944/2480 [04:19<06:06, 4.19it/s] 38%|███▊ | 945/2480 [04:20<06:10, 4.14it/s] 38%|███▊ | 946/2480 [04:20<06:12, 4.11it/s] 38%|███▊ | 947/2480 [04:20<06:32, 3.90it/s] 38%|███▊ | 948/2480 [04:20<06:20, 4.03it/s] 38%|███▊ | 949/2480 [04:21<06:14, 4.08it/s] 38%|███▊ | 950/2480 [04:21<06:40, 3.82it/s] 38%|███▊ | 951/2480 [04:21<06:07, 4.17it/s] 38%|███▊ | 952/2480 [04:21<06:17, 4.05it/s] 38%|███▊ | 953/2480 [04:22<06:04, 4.19it/s] 38%|███▊ | 954/2480 [04:22<06:01, 4.23it/s] 39%|███▊ | 955/2480 [04:22<06:10, 4.11it/s] 39%|███▊ | 956/2480 [04:22<05:46, 4.40it/s] 39%|███▊ | 957/2480 [04:22<05:41, 4.46it/s] 39%|███▊ | 958/2480 [04:23<05:49, 4.36it/s] 39%|███▊ | 959/2480 
[04:23<05:49, 4.35it/s] 39%|███▊ | 960/2480 [04:23<05:48, 4.36it/s] 39%|███▉ | 961/2480 [04:23<05:47, 4.37it/s] 39%|███▉ | 962/2480 [04:24<06:20, 3.99it/s] 39%|███▉ | 963/2480 [04:24<06:24, 3.95it/s] 39%|███▉ | 964/2480 [04:24<06:14, 4.05it/s] 39%|███▉ | 965/2480 [04:24<06:02, 4.18it/s] 39%|███▉ | 966/2480 [04:25<05:39, 4.45it/s] 39%|███▉ | 967/2480 [04:25<05:54, 4.26it/s] 39%|███▉ | 968/2480 [04:25<05:57, 4.23it/s] 39%|███▉ | 969/2480 [04:25<05:48, 4.33it/s] 39%|███▉ | 970/2480 [04:25<05:40, 4.44it/s] 39%|███▉ | 971/2480 [04:26<05:36, 4.48it/s] 39%|███▉ | 972/2480 [04:26<06:34, 3.82it/s] 39%|███▉ | 973/2480 [04:26<06:22, 3.94it/s] 39%|███▉ | 974/2480 [04:27<06:51, 3.66it/s] 39%|███▉ | 975/2480 [04:27<06:33, 3.83it/s] 39%|███▉ | 976/2480 [04:27<06:44, 3.72it/s] 39%|███▉ | 977/2480 [04:27<06:33, 3.82it/s] 39%|███▉ | 978/2480 [04:28<06:02, 4.14it/s] 39%|███▉ | 979/2480 [04:28<07:19, 3.42it/s] 40%|███▉ | 980/2480 [04:28<07:16, 3.44it/s] 40%|███▉ | 981/2480 [04:28<06:54, 3.61it/s] 40%|███▉ | 982/2480 [04:29<07:03, 3.54it/s] 40%|███▉ | 983/2480 [04:29<06:40, 3.74it/s] 40%|███▉ | 984/2480 [04:29<07:01, 3.55it/s] 40%|███▉ | 985/2480 [04:30<06:55, 3.60it/s] 40%|███▉ | 986/2480 [04:30<07:33, 3.29it/s] 40%|███▉ | 987/2480 [04:30<06:42, 3.71it/s] 40%|███▉ | 988/2480 [04:30<06:24, 3.88it/s] 40%|███▉ | 989/2480 [04:31<06:29, 3.83it/s] 40%|███▉ | 990/2480 [04:31<06:29, 3.83it/s] 40%|███▉ | 991/2480 [04:31<06:12, 4.00it/s] 40%|████ | 992/2480 [04:31<05:44, 4.32it/s]
[INFO|trainer.py:811] 2024-09-04 18:32:07,145 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: ner_tags, id, tokens. If ner_tags, id, tokens are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3819] 2024-09-04 18:32:07,147 >> ***** Running Evaluation *****
[INFO|trainer.py:3821] 2024-09-04 18:32:07,147 >> Num examples = 2519
[INFO|trainer.py:3824] 2024-09-04 18:32:07,147 >> Batch size = 8
{'eval_loss': 0.2111387550830841, 'eval_precision': 0.6651017214397497, 'eval_recall': 0.6978653530377669, 'eval_f1': 0.6810897435897436, 'eval_accuracy': 0.9491802752735089, 'eval_runtime': 5.4844, 'eval_samples_per_second': 459.302, 'eval_steps_per_second': 57.435, 'epoch': 3.0}
0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-992
[INFO|configuration_utils.py:472] 2024-09-04 18:32:12,598 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-992/config.json
[INFO|modeling_utils.py:2799] 2024-09-04 18:32:13,609 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-992/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:32:13,610 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-992/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:32:13,610 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-992/special_tokens_map.json
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:32:15,718 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:32:15,718 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
40%|████ | 993/2480 [04:40<1:10:26, 2.84s/it] 40%|████ | 994/2480 [04:41<51:08, 2.07s/it] 40%|████ | 995/2480 [04:41<37:20, 1.51s/it] 40%|████ | 996/2480 [04:41<28:03, 1.13s/it] 40%|████ | 997/2480 [04:41<21:05, 1.17it/s] 40%|████ | 998/2480 [04:41<16:24, 1.51it/s] 40%|████ | 999/2480 [04:42<13:18, 1.85it/s] 40%|████ | 1000/2480 [04:42<10:49, 2.28it/s] 40%|████ | 1001/2480
[04:42<09:11, 2.68it/s] 40%|████ | 1002/2480 [04:42<08:09, 3.02it/s] 40%|████ | 1003/2480 [04:43<07:25, 3.32it/s] 40%|████ | 1004/2480 [04:43<06:57, 3.54it/s] 41%|████ | 1005/2480 [04:43<06:41, 3.68it/s] 41%|████ | 1006/2480 [04:43<06:06, 4.03it/s] 41%|████ | 1007/2480 [04:43<05:47, 4.24it/s] 41%|████ | 1008/2480 [04:44<05:43, 4.28it/s] 41%|████ | 1009/2480 [04:44<05:41, 4.31it/s] 41%|████ | 1010/2480 [04:44<06:34, 3.73it/s] 41%|████ | 1011/2480 [04:44<06:07, 4.00it/s] 41%|████ | 1012/2480 [04:45<05:42, 4.29it/s] 41%|████ | 1013/2480 [04:45<05:16, 4.64it/s] 41%|████ | 1014/2480 [04:45<05:08, 4.75it/s] 41%|████ | 1015/2480 [04:45<05:22, 4.55it/s] 41%|████ | 1016/2480 [04:45<05:26, 4.48it/s] 41%|████ | 1017/2480 [04:46<06:00, 4.06it/s] 41%|████ | 1018/2480 [04:46<05:34, 4.37it/s] 41%|████ | 1019/2480 [04:46<06:02, 4.03it/s] 41%|████ | 1020/2480 [04:46<05:45, 4.22it/s] 41%|████ | 1021/2480 [04:47<06:02, 4.02it/s] 41%|████ | 1022/2480 [04:47<06:29, 3.74it/s] 41%|████▏ | 1023/2480 [04:47<06:24, 3.79it/s] 41%|████▏ | 1024/2480 [04:48<06:31, 3.72it/s] 41%|████▏ | 1025/2480 [04:48<05:57, 4.07it/s] 41%|████▏ | 1026/2480 [04:48<05:43, 4.23it/s] 41%|████▏ | 1027/2480 [04:48<05:32, 4.37it/s] 41%|████▏ | 1028/2480 [04:48<05:12, 4.65it/s] 41%|████▏ | 1029/2480 [04:49<05:28, 4.42it/s] 42%|████▏ | 1030/2480 [04:49<05:06, 4.72it/s] 42%|████▏ | 1031/2480 [04:49<05:02, 4.79it/s] 42%|████▏ | 1032/2480 [04:49<05:45, 4.19it/s] 42%|████▏ | 1033/2480 [04:50<06:51, 3.52it/s] 42%|████▏ | 1034/2480 [04:50<06:51, 3.52it/s] 42%|████▏ | 1035/2480 [04:50<06:06, 3.95it/s] 42%|████▏ | 1036/2480 [04:50<05:35, 4.31it/s] 42%|████▏ | 1037/2480 [04:51<05:24, 4.44it/s] 42%|████▏ | 1038/2480 [04:51<05:18, 4.53it/s] 42%|████▏ | 1039/2480 [04:51<05:03, 4.74it/s] 42%|████▏ | 1040/2480 [04:51<05:06, 4.70it/s] 42%|████▏ | 1041/2480 [04:51<04:58, 4.82it/s] 42%|████▏ | 1042/2480 [04:52<05:03, 4.74it/s] 42%|████▏ | 1043/2480 [04:52<05:36, 4.27it/s] 42%|████▏ | 1044/2480 [04:52<05:36, 4.27it/s] 42%|████▏ | 
1045/2480 [04:52<05:40, 4.22it/s] 42%|████▏ | 1046/2480 [04:53<05:27, 4.38it/s] 42%|████▏ | 1047/2480 [04:53<05:32, 4.31it/s] 42%|████▏ | 1048/2480 [04:53<05:15, 4.54it/s] 42%|████▏ | 1049/2480 [04:53<05:18, 4.49it/s] 42%|████▏ | 1050/2480 [04:54<05:27, 4.37it/s] 42%|████▏ | 1051/2480 [04:54<05:02, 4.73it/s] 42%|████▏ | 1052/2480 [04:54<04:51, 4.90it/s] 42%|████▏ | 1053/2480 [04:54<04:49, 4.93it/s] 42%|████▎ | 1054/2480 [04:54<04:52, 4.88it/s] 43%|████▎ | 1055/2480 [04:54<04:55, 4.83it/s] 43%|████▎ | 1056/2480 [04:55<05:07, 4.63it/s] 43%|████▎ | 1057/2480 [04:55<05:14, 4.52it/s] 43%|████▎ | 1058/2480 [04:55<06:23, 3.71it/s] 43%|████▎ | 1059/2480 [04:56<06:05, 3.89it/s] 43%|████▎ | 1060/2480 [04:56<06:02, 3.92it/s] 43%|████▎ | 1061/2480 [04:56<05:43, 4.13it/s] 43%|████▎ | 1062/2480 [04:56<06:03, 3.90it/s] 43%|████▎ | 1063/2480 [04:57<05:37, 4.19it/s] 43%|████▎ | 1064/2480 [04:57<06:05, 3.87it/s] 43%|████▎ | 1065/2480 [04:57<06:06, 3.86it/s] 43%|████▎ | 1066/2480 [04:58<07:21, 3.20it/s] 43%|████▎ | 1067/2480 [04:58<06:33, 3.59it/s] 43%|████▎ | 1068/2480 [04:58<06:02, 3.90it/s] 43%|████▎ | 1069/2480 [04:58<06:03, 3.88it/s] 43%|████▎ | 1070/2480 [04:59<06:46, 3.47it/s] 43%|████▎ | 1071/2480 [04:59<06:16, 3.74it/s] 43%|████▎ | 1072/2480 [04:59<06:06, 3.84it/s] 43%|████▎ | 1073/2480 [04:59<06:07, 3.83it/s] 43%|████▎ | 1074/2480 [04:59<05:51, 4.00it/s] 43%|████▎ | 1075/2480 [05:00<06:21, 3.68it/s] 43%|████▎ | 1076/2480 [05:00<05:49, 4.01it/s] 43%|████▎ | 1077/2480 [05:00<05:23, 4.34it/s] 43%|████▎ | 1078/2480 [05:00<05:51, 3.98it/s] 44%|████▎ | 1079/2480 [05:01<06:10, 3.78it/s] 44%|████▎ | 1080/2480 [05:01<06:08, 3.80it/s] 44%|████▎ | 1081/2480 [05:01<06:33, 3.56it/s] 44%|████▎ | 1082/2480 [05:02<06:57, 3.35it/s] 44%|████▎ | 1083/2480 [05:02<06:21, 3.66it/s] 44%|████▎ | 1084/2480 [05:02<06:07, 3.80it/s] 44%|████▍ | 1085/2480 [05:02<05:50, 3.98it/s] 44%|████▍ | 1086/2480 [05:03<05:37, 4.13it/s] 44%|████▍ | 1087/2480 [05:03<05:18, 4.38it/s] 44%|████▍ | 1088/2480 
[05:03<05:14, 4.43it/s] 44%|████▍ | 1089/2480 [05:03<05:23, 4.30it/s] 44%|████▍ | 1090/2480 [05:04<05:20, 4.34it/s] 44%|████▍ | 1091/2480 [05:04<05:36, 4.13it/s] 44%|████▍ | 1092/2480 [05:04<05:37, 4.11it/s] 44%|████▍ | 1093/2480 [05:04<05:37, 4.11it/s] 44%|████▍ | 1094/2480 [05:05<05:52, 3.94it/s] 44%|████▍ | 1095/2480 [05:05<05:51, 3.94it/s] 44%|████▍ | 1096/2480 [05:05<05:39, 4.08it/s] 44%|████▍ | 1097/2480 [05:05<05:12, 4.43it/s] 44%|████▍ | 1098/2480 [05:05<05:04, 4.53it/s] 44%|████▍ | 1099/2480 [05:06<05:03, 4.55it/s] 44%|████▍ | 1100/2480 [05:06<05:42, 4.03it/s] 44%|████▍ | 1101/2480 [05:06<05:34, 4.13it/s] 44%|████▍ | 1102/2480 [05:06<05:23, 4.26it/s] 44%|████▍ | 1103/2480 [05:07<05:28, 4.19it/s] 45%|████▍ | 1104/2480 [05:07<05:30, 4.17it/s] 45%|████▍ | 1105/2480 [05:07<05:25, 4.23it/s] 45%|████▍ | 1106/2480 [05:07<05:07, 4.47it/s] 45%|████▍ | 1107/2480 [05:08<05:11, 4.40it/s] 45%|████▍ | 1108/2480 [05:08<05:12, 4.39it/s] 45%|████▍ | 1109/2480 [05:08<05:22, 4.25it/s] 45%|████▍ | 1110/2480 [05:08<05:33, 4.11it/s] 45%|████▍ | 1111/2480 [05:08<05:11, 4.40it/s] 45%|████▍ | 1112/2480 [05:09<05:30, 4.14it/s] 45%|████▍ | 1113/2480 [05:09<05:24, 4.21it/s] 45%|████▍ | 1114/2480 [05:09<05:02, 4.52it/s] 45%|████▍ | 1115/2480 [05:09<05:05, 4.47it/s] 45%|████▌ | 1116/2480 [05:10<04:57, 4.59it/s] 45%|████▌ | 1117/2480 [05:10<05:00, 4.54it/s] 45%|████▌ | 1118/2480 [05:10<04:56, 4.60it/s] 45%|████▌ | 1119/2480 [05:10<05:40, 3.99it/s] 45%|████▌ | 1120/2480 [05:11<05:26, 4.16it/s] 45%|████▌ | 1121/2480 [05:11<05:30, 4.11it/s] 45%|████▌ | 1122/2480 [05:11<06:13, 3.63it/s] 45%|████▌ | 1123/2480 [05:11<05:59, 3.78it/s] 45%|████▌ | 1124/2480 [05:12<05:52, 3.85it/s] 45%|████▌ | 1125/2480 [05:12<05:16, 4.28it/s] 45%|████▌ | 1126/2480 [05:12<06:16, 3.60it/s] 45%|████▌ | 1127/2480 [05:12<06:03, 3.73it/s] 45%|████▌ | 1128/2480 [05:13<05:38, 3.99it/s] 46%|████▌ | 1129/2480 [05:13<05:20, 4.21it/s] 46%|████▌ | 1130/2480 [05:13<05:09, 4.36it/s] 46%|████▌ | 1131/2480 [05:13<05:12, 
4.32it/s] 46%|████▌ | 1132/2480 [05:14<05:44, 3.91it/s] 46%|████▌ | 1133/2480 [05:14<05:21, 4.18it/s] 46%|████▌ | 1134/2480 [05:14<05:51, 3.83it/s] 46%|████▌ | 1135/2480 [05:14<05:29, 4.08it/s] 46%|████▌ | 1136/2480 [05:15<05:44, 3.90it/s] 46%|████▌ | 1137/2480 [05:15<06:29, 3.45it/s] 46%|████▌ | 1138/2480 [05:15<06:00, 3.72it/s] 46%|████▌ | 1139/2480 [05:15<05:44, 3.89it/s] 46%|████▌ | 1140/2480 [05:16<05:32, 4.03it/s] 46%|████▌ | 1141/2480 [05:16<05:30, 4.05it/s] 46%|████▌ | 1142/2480 [05:16<05:31, 4.03it/s] 46%|████▌ | 1143/2480 [05:16<05:06, 4.37it/s] 46%|████▌ | 1144/2480 [05:17<04:54, 4.54it/s] 46%|████▌ | 1145/2480 [05:17<04:59, 4.45it/s] 46%|████▌ | 1146/2480 [05:17<05:06, 4.35it/s] 46%|████▋ | 1147/2480 [05:17<04:44, 4.69it/s] 46%|████▋ | 1148/2480 [05:17<04:53, 4.53it/s] 46%|████▋ | 1149/2480 [05:18<06:01, 3.68it/s] 46%|████▋ | 1150/2480 [05:18<05:34, 3.98it/s] 46%|████▋ | 1151/2480 [05:18<05:34, 3.97it/s] 46%|████▋ | 1152/2480 [05:19<06:11, 3.58it/s] 46%|████▋ | 1153/2480 [05:19<06:05, 3.63it/s] 47%|████▋ | 1154/2480 [05:19<05:31, 4.00it/s] 47%|████▋ | 1155/2480 [05:19<05:13, 4.22it/s] 47%|████▋ | 1156/2480 [05:20<05:51, 3.77it/s] 47%|████▋ | 1157/2480 [05:20<05:18, 4.15it/s] 47%|████▋ | 1158/2480 [05:20<05:33, 3.97it/s] 47%|████▋ | 1159/2480 [05:20<05:26, 4.05it/s] 47%|████▋ | 1160/2480 [05:21<05:14, 4.20it/s] 47%|████▋ | 1161/2480 [05:21<05:07, 4.30it/s] 47%|████▋ | 1162/2480 [05:21<05:14, 4.19it/s] 47%|████▋ | 1163/2480 [05:21<05:18, 4.14it/s] 47%|████▋ | 1164/2480 [05:21<04:57, 4.42it/s] 47%|████▋ | 1165/2480 [05:22<04:45, 4.60it/s] 47%|████▋ | 1166/2480 [05:22<04:39, 4.71it/s] 47%|████▋ | 1167/2480 [05:22<04:33, 4.80it/s] 47%|████▋ | 1168/2480 [05:22<04:42, 4.64it/s] 47%|████▋ | 1169/2480 [05:23<04:55, 4.44it/s] 47%|████▋ | 1170/2480 [05:23<04:56, 4.41it/s] 47%|████▋ | 1171/2480 [05:23<05:42, 3.82it/s] 47%|████▋ | 1172/2480 [05:23<05:22, 4.06it/s] 47%|████▋ | 1173/2480 [05:24<05:32, 3.93it/s] 47%|████▋ | 1174/2480 [05:24<05:34, 3.91it/s] 47%|████▋ | 
1175/2480 [05:24<05:25, 4.01it/s] 47%|████▋ | 1176/2480 [05:24<05:20, 4.07it/s] 47%|████▋ | 1177/2480 [05:25<05:20, 4.07it/s] 48%|████▊ | 1178/2480 [05:25<05:13, 4.16it/s] 48%|████▊ | 1179/2480 [05:25<04:58, 4.35it/s] 48%|████▊ | 1180/2480 [05:26<07:16, 2.98it/s] 48%|████▊ | 1181/2480 [05:26<07:03, 3.06it/s] 48%|████▊ | 1182/2480 [05:26<06:30, 3.33it/s] 48%|████▊ | 1183/2480 [05:26<06:25, 3.37it/s] 48%|████▊ | 1184/2480 [05:27<05:55, 3.65it/s] 48%|████▊ | 1185/2480 [05:27<05:54, 3.66it/s] 48%|████▊ | 1186/2480 [05:27<05:21, 4.02it/s] 48%|████▊ | 1187/2480 [05:27<05:50, 3.68it/s] 48%|████▊ | 1188/2480 [05:28<05:40, 3.79it/s] 48%|████▊ | 1189/2480 [05:28<05:13, 4.12it/s] 48%|████▊ | 1190/2480 [05:28<05:14, 4.11it/s] 48%|████▊ | 1191/2480 [05:28<05:08, 4.17it/s] 48%|████▊ | 1192/2480 [05:29<05:17, 4.05it/s] 48%|████▊ | 1193/2480 [05:29<04:51, 4.41it/s] 48%|████▊ | 1194/2480 [05:29<04:55, 4.36it/s] 48%|████▊ | 1195/2480 [05:29<04:52, 4.39it/s] 48%|████▊ | 1196/2480 [05:29<04:54, 4.35it/s] 48%|████▊ | 1197/2480 [05:30<05:36, 3.82it/s] 48%|████▊ | 1198/2480 [05:30<05:10, 4.12it/s] 48%|████▊ | 1199/2480 [05:30<05:15, 4.06it/s] 48%|████▊ | 1200/2480 [05:31<05:29, 3.88it/s] 48%|████▊ | 1201/2480 [05:31<05:19, 4.00it/s] 48%|████▊ | 1202/2480 [05:31<05:08, 4.15it/s] 49%|████▊ | 1203/2480 [05:31<05:03, 4.21it/s] 49%|████▊ | 1204/2480 [05:31<04:52, 4.36it/s] 49%|████▊ | 1205/2480 [05:32<04:49, 4.40it/s] 49%|████▊ | 1206/2480 [05:32<05:13, 4.06it/s] 49%|████▊ | 1207/2480 [05:32<05:38, 3.76it/s] 49%|████▊ | 1208/2480 [05:32<05:17, 4.01it/s] 49%|████▉ | 1209/2480 [05:33<06:22, 3.33it/s] 49%|████▉ | 1210/2480 [05:33<05:49, 3.63it/s] 49%|████▉ | 1211/2480 [05:33<05:53, 3.59it/s] 49%|████▉ | 1212/2480 [05:34<05:41, 3.72it/s] 49%|████▉ | 1213/2480 [05:34<05:21, 3.94it/s] 49%|████▉ | 1214/2480 [05:34<05:00, 4.21it/s] 49%|████▉ | 1215/2480 [05:34<05:29, 3.84it/s] 49%|████▉ | 1216/2480 [05:35<05:03, 4.16it/s] 49%|████▉ | 1217/2480 [05:35<05:00, 4.20it/s] 49%|████▉ | 1218/2480 
[05:35<04:52, 4.32it/s] 49%|████▉ | 1219/2480 [05:35<05:12, 4.03it/s] 49%|████▉ | 1220/2480 [05:36<05:13, 4.02it/s] 49%|████▉ | 1221/2480 [05:36<05:25, 3.87it/s] 49%|████▉ | 1222/2480 [05:36<06:00, 3.49it/s] 49%|████▉ | 1223/2480 [05:36<05:24, 3.87it/s] 49%|████▉ | 1224/2480 [05:37<05:31, 3.79it/s] 49%|████▉ | 1225/2480 [05:37<05:05, 4.10it/s] 49%|████▉ | 1226/2480 [05:37<05:02, 4.15it/s] 49%|████▉ | 1227/2480 [05:37<04:57, 4.21it/s] 50%|████▉ | 1228/2480 [05:38<04:54, 4.26it/s] 50%|████▉ | 1229/2480 [05:38<04:55, 4.23it/s] 50%|████▉ | 1230/2480 [05:38<05:55, 3.51it/s] 50%|████▉ | 1231/2480 [05:38<05:13, 3.98it/s] 50%|████▉ | 1232/2480 [05:39<05:05, 4.08it/s] 50%|████▉ | 1233/2480 [05:39<04:48, 4.32it/s] 50%|████▉ | 1234/2480 [05:39<04:58, 4.17it/s] 50%|████▉ | 1235/2480 [05:39<05:01, 4.13it/s] 50%|████▉ | 1236/2480 [05:40<05:08, 4.04it/s] 50%|████▉ | 1237/2480 [05:40<05:21, 3.87it/s] 50%|████▉ | 1238/2480 [05:40<05:51, 3.53it/s] 50%|████▉ | 1239/2480 [05:40<05:23, 3.84it/s] 50%|█████ | 1240/2480 [05:41<04:36, 4.48it/s]
[INFO|trainer.py:811] 2024-09-04 18:33:16,340 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: ner_tags, id, tokens. If ner_tags, id, tokens are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3819] 2024-09-04 18:33:16,343 >> ***** Running Evaluation *****
[INFO|trainer.py:3821] 2024-09-04 18:33:16,343 >> Num examples = 2519
[INFO|trainer.py:3824] 2024-09-04 18:33:16,343 >> Batch size = 8
{'eval_loss': 0.25230270624160767, 'eval_precision': 0.6692386831275721, 'eval_recall': 0.7120963327859879, 'eval_f1': 0.6900026518164943, 'eval_accuracy': 0.9488434020982386, 'eval_runtime': 5.4481, 'eval_samples_per_second': 462.36, 'eval_steps_per_second': 57.818, 'epoch': 4.0}
{'loss': 0.026, 'grad_norm': 0.7998089790344238, 'learning_rate': 2.9838709677419357e-05, 'epoch': 4.03}
0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-1240
[INFO|configuration_utils.py:472] 2024-09-04 18:33:21,999 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-1240/config.json
[INFO|modeling_utils.py:2799] 2024-09-04 18:33:23,002 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-1240/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:33:23,003 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-1240/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:33:23,004 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-1240/special_tokens_map.json
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:33:25,081 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:33:25,082 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
50%|█████ | 1241/2480 [05:50<59:12, 2.87s/it] 50%|█████ | 1242/2480 [05:50<42:41, 2.07s/it] 50%|█████ | 1243/2480 [05:50<31:28, 1.53s/it] 50%|█████ | 1244/2480 [05:50<23:23, 1.14s/it] 50%|█████ | 1245/2480 [05:50<17:30, 1.18it/s] 50%|█████ | 1246/2480 [05:51<13:34, 1.52it/s] 50%|█████ | 1247/2480 [05:51<10:56,
1.88it/s] 50%|█████ | 1248/2480 [05:51<09:05, 2.26it/s] 50%|█████ | 1249/2480 [05:51<07:42, 2.66it/s] 50%|█████ | 1250/2480 [05:52<06:49, 3.00it/s] 50%|█████ | 1251/2480 [05:52<06:29, 3.15it/s] 50%|█████ | 1252/2480 [05:52<06:04, 3.37it/s] 51%|█████ | 1253/2480 [05:52<05:56, 3.45it/s] 51%|█████ | 1254/2480 [05:53<05:42, 3.58it/s] 51%|█████ | 1255/2480 [05:53<05:24, 3.77it/s] 51%|█████ | 1256/2480 [05:53<05:37, 3.63it/s] 51%|█████ | 1257/2480 [05:53<05:26, 3.74it/s] 51%|█████ | 1258/2480 [05:54<05:23, 3.77it/s] 51%|█████ | 1259/2480 [05:54<05:37, 3.62it/s] 51%|█████ | 1260/2480 [05:54<05:02, 4.03it/s] 51%|█████ | 1261/2480 [05:55<05:36, 3.63it/s] 51%|█████ | 1262/2480 [05:55<05:17, 3.83it/s] 51%|█████ | 1263/2480 [05:55<05:06, 3.96it/s] 51%|█████ | 1264/2480 [05:55<05:16, 3.85it/s] 51%|█████ | 1265/2480 [05:55<04:59, 4.06it/s] 51%|█████ | 1266/2480 [05:56<04:59, 4.05it/s] 51%|█████ | 1267/2480 [05:56<04:47, 4.22it/s] 51%|█████ | 1268/2480 [05:56<04:55, 4.10it/s] 51%|█████ | 1269/2480 [05:57<06:17, 3.21it/s] 51%|█████ | 1270/2480 [05:57<05:40, 3.56it/s] 51%|█████▏ | 1271/2480 [05:57<05:26, 3.71it/s] 51%|█████▏ | 1272/2480 [05:57<05:16, 3.82it/s] 51%|█████▏ | 1273/2480 [05:58<04:51, 4.15it/s] 51%|█████▏ | 1274/2480 [05:58<04:55, 4.08it/s] 51%|█████▏ | 1275/2480 [05:58<04:44, 4.24it/s] 51%|█████▏ | 1276/2480 [05:58<04:53, 4.10it/s] 51%|█████▏ | 1277/2480 [05:58<04:43, 4.24it/s] 52%|█████▏ | 1278/2480 [05:59<04:45, 4.21it/s] 52%|█████▏ | 1279/2480 [05:59<04:27, 4.49it/s] 52%|█████▏ | 1280/2480 [05:59<04:24, 4.53it/s] 52%|█████▏ | 1281/2480 [05:59<04:11, 4.77it/s] 52%|█████▏ | 1282/2480 [06:00<04:12, 4.74it/s] 52%|█████▏ | 1283/2480 [06:00<04:02, 4.93it/s] 52%|█████▏ | 1284/2480 [06:00<04:21, 4.57it/s] 52%|█████▏ | 1285/2480 [06:00<04:29, 4.44it/s] 52%|█████▏ | 1286/2480 [06:00<04:32, 4.39it/s] 52%|█████▏ | 1287/2480 [06:01<04:24, 4.50it/s] 52%|█████▏ | 1288/2480 [06:01<04:27, 4.46it/s] 52%|█████▏ | 1289/2480 [06:01<04:29, 4.41it/s] 52%|█████▏ | 1290/2480 [06:01<04:28, 
4.44it/s] 52%|█████▏ | 1291/2480 [06:02<04:28, 4.42it/s] 52%|█████▏ | 1292/2480 [06:02<04:19, 4.59it/s] 52%|█████▏ | 1293/2480 [06:02<04:14, 4.67it/s] 52%|█████▏ | 1294/2480 [06:02<04:21, 4.54it/s] 52%|█████▏ | 1295/2480 [06:03<05:07, 3.86it/s] 52%|█████▏ | 1296/2480 [06:03<05:12, 3.79it/s] 52%|█████▏ | 1297/2480 [06:03<04:47, 4.11it/s] 52%|█████▏ | 1298/2480 [06:03<04:53, 4.02it/s] 52%|█████▏ | 1299/2480 [06:04<04:53, 4.03it/s] 52%|█████▏ | 1300/2480 [06:04<05:39, 3.47it/s] 52%|█████▏ | 1301/2480 [06:04<05:08, 3.82it/s] 52%|█████▎ | 1302/2480 [06:04<05:14, 3.75it/s] 53%|█████▎ | 1303/2480 [06:05<04:57, 3.96it/s] 53%|█████▎ | 1304/2480 [06:05<04:48, 4.07it/s] 53%|█████▎ | 1305/2480 [06:05<04:36, 4.25it/s] 53%|█████▎ | 1306/2480 [06:05<04:21, 4.49it/s] 53%|█████▎ | 1307/2480 [06:05<04:17, 4.56it/s] 53%|█████▎ | 1308/2480 [06:06<04:18, 4.54it/s] 53%|█████▎ | 1309/2480 [06:06<04:15, 4.59it/s] 53%|█████▎ | 1310/2480 [06:06<04:30, 4.32it/s] 53%|█████▎ | 1311/2480 [06:06<04:56, 3.95it/s] 53%|█████▎ | 1312/2480 [06:07<04:52, 4.00it/s] 53%|█████▎ | 1313/2480 [06:07<04:50, 4.02it/s] 53%|█████▎ | 1314/2480 [06:07<04:45, 4.08it/s] 53%|█████▎ | 1315/2480 [06:07<04:32, 4.27it/s] 53%|█████▎ | 1316/2480 [06:08<04:29, 4.33it/s] 53%|█████▎ | 1317/2480 [06:08<04:34, 4.24it/s] 53%|█████▎ | 1318/2480 [06:08<04:36, 4.20it/s] 53%|█████▎ | 1319/2480 [06:08<04:32, 4.27it/s] 53%|█████▎ | 1320/2480 [06:09<04:49, 4.00it/s] 53%|█████▎ | 1321/2480 [06:09<04:30, 4.29it/s] 53%|█████▎ | 1322/2480 [06:09<04:31, 4.27it/s] 53%|█████▎ | 1323/2480 [06:09<04:26, 4.34it/s] 53%|█████▎ | 1324/2480 [06:09<04:17, 4.50it/s] 53%|█████▎ | 1325/2480 [06:10<04:20, 4.44it/s] 53%|█████▎ | 1326/2480 [06:10<04:29, 4.28it/s] 54%|█████▎ | 1327/2480 [06:10<04:27, 4.30it/s] 54%|█████▎ | 1328/2480 [06:10<04:21, 4.41it/s] 54%|█████▎ | 1329/2480 [06:11<04:17, 4.47it/s] 54%|█████▎ | 1330/2480 [06:11<04:06, 4.67it/s] 54%|█████▎ | 1331/2480 [06:11<04:01, 4.75it/s] 54%|█████▎ | 1332/2480 [06:11<04:08, 4.61it/s] 54%|█████▍ | 
1333/2480 [06:11<04:19, 4.41it/s] 54%|█████▍ | 1334/2480 [06:12<04:32, 4.20it/s] 54%|█████▍ | 1335/2480 [06:12<04:23, 4.35it/s] 54%|█████▍ | 1336/2480 [06:12<04:20, 4.38it/s] 54%|█████▍ | 1337/2480 [06:12<04:18, 4.42it/s] 54%|█████▍ | 1338/2480 [06:13<04:11, 4.54it/s] 54%|█████▍ | 1339/2480 [06:13<04:32, 4.19it/s] 54%|█████▍ | 1340/2480 [06:13<04:43, 4.02it/s] 54%|█████▍ | 1341/2480 [06:13<04:37, 4.11it/s] 54%|█████▍ | 1342/2480 [06:14<04:32, 4.17it/s] 54%|█████▍ | 1343/2480 [06:14<05:09, 3.68it/s] 54%|█████▍ | 1344/2480 [06:14<04:41, 4.03it/s] 54%|█████▍ | 1345/2480 [06:14<04:42, 4.02it/s] 54%|█████▍ | 1346/2480 [06:15<04:52, 3.87it/s] 54%|█████▍ | 1347/2480 [06:15<04:37, 4.08it/s] 54%|█████▍ | 1348/2480 [06:15<04:43, 4.00it/s] 54%|█████▍ | 1349/2480 [06:15<04:24, 4.27it/s] 54%|█████▍ | 1350/2480 [06:16<04:46, 3.94it/s] 54%|█████▍ | 1351/2480 [06:16<04:44, 3.97it/s] 55%|█████▍ | 1352/2480 [06:16<05:06, 3.67it/s] 55%|█████▍ | 1353/2480 [06:16<04:48, 3.91it/s] 55%|█████▍ | 1354/2480 [06:17<04:29, 4.18it/s] 55%|█████▍ | 1355/2480 [06:17<04:56, 3.79it/s] 55%|█████▍ | 1356/2480 [06:17<04:44, 3.96it/s] 55%|█████▍ | 1357/2480 [06:17<04:23, 4.26it/s] 55%|█████▍ | 1358/2480 [06:18<04:13, 4.43it/s] 55%|█████▍ | 1359/2480 [06:18<04:06, 4.55it/s] 55%|█████▍ | 1360/2480 [06:18<03:56, 4.73it/s] 55%|█████▍ | 1361/2480 [06:18<03:44, 4.98it/s] 55%|█████▍ | 1362/2480 [06:18<03:48, 4.89it/s] 55%|█████▍ | 1363/2480 [06:19<03:48, 4.89it/s] 55%|█████▌ | 1364/2480 [06:19<04:28, 4.15it/s] 55%|█████▌ | 1365/2480 [06:19<04:31, 4.11it/s] 55%|█████▌ | 1366/2480 [06:19<04:13, 4.40it/s] 55%|█████▌ | 1367/2480 [06:20<04:13, 4.40it/s] 55%|█████▌ | 1368/2480 [06:20<04:03, 4.56it/s] 55%|█████▌ | 1369/2480 [06:20<04:00, 4.63it/s] 55%|█████▌ | 1370/2480 [06:20<03:44, 4.94it/s] 55%|█████▌ | 1371/2480 [06:21<04:40, 3.95it/s] 55%|█████▌ | 1372/2480 [06:21<04:45, 3.88it/s] 55%|█████▌ | 1373/2480 [06:21<04:56, 3.74it/s] 55%|█████▌ | 1374/2480 [06:21<05:00, 3.68it/s] 55%|█████▌ | 1375/2480 [06:22<04:36, 
4.00it/s] 55%|█████▌ | 1376/2480 [06:22<04:43, 3.89it/s] 56%|█████▌ | 1377/2480 [06:22<04:29, 4.09it/s] 56%|█████▌ | 1378/2480 [06:22<04:11, 4.39it/s] 56%|█████▌ | 1379/2480 [06:22<04:13, 4.34it/s] 56%|█████▌ | 1380/2480 [06:23<03:52, 4.73it/s] 56%|█████▌ | 1381/2480 [06:23<04:28, 4.10it/s] 56%|█████▌ | 1382/2480 [06:23<04:24, 4.16it/s] 56%|█████▌ | 1383/2480 [06:23<04:12, 4.34it/s] 56%|█████▌ | 1384/2480 [06:24<04:19, 4.22it/s] 56%|█████▌ | 1385/2480 [06:24<04:11, 4.36it/s] 56%|█████▌ | 1386/2480 [06:24<04:01, 4.53it/s] 56%|█████▌ | 1387/2480 [06:24<04:43, 3.86it/s] 56%|█████▌ | 1388/2480 [06:25<04:20, 4.18it/s] 56%|█████▌ | 1389/2480 [06:25<04:10, 4.36it/s] 56%|█████▌ | 1390/2480 [06:25<04:16, 4.25it/s] 56%|█████▌ | 1391/2480 [06:25<04:31, 4.01it/s] 56%|█████▌ | 1392/2480 [06:26<04:25, 4.10it/s] 56%|█████▌ | 1393/2480 [06:26<04:34, 3.96it/s] 56%|█████▌ | 1394/2480 [06:26<04:25, 4.09it/s] 56%|█████▋ | 1395/2480 [06:26<05:06, 3.54it/s] 56%|█████▋ | 1396/2480 [06:27<04:59, 3.62it/s] 56%|█████▋ | 1397/2480 [06:27<05:02, 3.58it/s] 56%|█████▋ | 1398/2480 [06:27<05:10, 3.48it/s] 56%|█████▋ | 1399/2480 [06:28<05:47, 3.11it/s] 56%|█████▋ | 1400/2480 [06:28<05:41, 3.16it/s] 56%|█████▋ | 1401/2480 [06:28<06:07, 2.94it/s] 57%|█████▋ | 1402/2480 [06:29<06:14, 2.88it/s] 57%|█████▋ | 1403/2480 [06:29<06:37, 2.71it/s] 57%|█████▋ | 1404/2480 [06:29<05:40, 3.16it/s] 57%|█████▋ | 1405/2480 [06:30<05:00, 3.58it/s] 57%|█████▋ | 1406/2480 [06:30<04:30, 3.98it/s] 57%|█████▋ | 1407/2480 [06:30<04:08, 4.31it/s] 57%|█████▋ | 1408/2480 [06:30<04:50, 3.69it/s] 57%|█████▋ | 1409/2480 [06:31<04:30, 3.95it/s] 57%|█████▋ | 1410/2480 [06:31<04:25, 4.02it/s] 57%|█████▋ | 1411/2480 [06:31<04:07, 4.31it/s] 57%|█████▋ | 1412/2480 [06:31<04:11, 4.25it/s] 57%|█████▋ | 1413/2480 [06:32<04:35, 3.88it/s] 57%|█████▋ | 1414/2480 [06:32<04:26, 4.00it/s] 57%|█████▋ | 1415/2480 [06:32<04:14, 4.19it/s] 57%|█████▋ | 1416/2480 [06:32<04:02, 4.38it/s] 57%|█████▋ | 1417/2480 [06:32<04:10, 4.25it/s] 57%|█████▋ | 
1418/2480 [06:33<04:14, 4.18it/s] 57%|█████▋ | 1419/2480 [06:33<04:46, 3.71it/s] 57%|█████▋ | 1420/2480 [06:33<04:29, 3.93it/s] 57%|█████▋ | 1421/2480 [06:34<04:37, 3.82it/s] 57%|█████▋ | 1422/2480 [06:34<04:31, 3.90it/s] 57%|█████▋ | 1423/2480 [06:34<04:27, 3.95it/s] 57%|█████▋ | 1424/2480 [06:34<04:32, 3.87it/s] 57%|█████▋ | 1425/2480 [06:34<04:20, 4.05it/s] 57%|█████▊ | 1426/2480 [06:35<04:20, 4.04it/s] 58%|█████▊ | 1427/2480 [06:35<04:09, 4.22it/s] 58%|█████▊ | 1428/2480 [06:35<04:10, 4.20it/s] 58%|█████▊ | 1429/2480 [06:36<04:41, 3.74it/s] 58%|█████▊ | 1430/2480 [06:36<04:34, 3.82it/s] 58%|█████▊ | 1431/2480 [06:36<04:25, 3.96it/s] 58%|█████▊ | 1432/2480 [06:36<04:15, 4.10it/s] 58%|█████▊ | 1433/2480 [06:36<04:14, 4.11it/s] 58%|█████▊ | 1434/2480 [06:37<04:40, 3.73it/s] 58%|█████▊ | 1435/2480 [06:37<04:24, 3.95it/s] 58%|█████▊ | 1436/2480 [06:37<04:13, 4.11it/s] 58%|█████▊ | 1437/2480 [06:37<03:57, 4.39it/s] 58%|█████▊ | 1438/2480 [06:38<04:20, 4.00it/s] 58%|█████▊ | 1439/2480 [06:38<04:17, 4.04it/s] 58%|█████▊ | 1440/2480 [06:38<04:10, 4.16it/s] 58%|█████▊ | 1441/2480 [06:38<04:05, 4.22it/s] 58%|█████▊ | 1442/2480 [06:39<03:58, 4.35it/s] 58%|█████▊ | 1443/2480 [06:39<03:51, 4.48it/s] 58%|█████▊ | 1444/2480 [06:39<04:22, 3.95it/s] 58%|█████▊ | 1445/2480 [06:39<04:14, 4.06it/s] 58%|█████▊ | 1446/2480 [06:40<03:55, 4.39it/s] 58%|█████▊ | 1447/2480 [06:40<04:05, 4.20it/s] 58%|█████▊ | 1448/2480 [06:40<03:59, 4.31it/s] 58%|█████▊ | 1449/2480 [06:40<04:19, 3.97it/s] 58%|█████▊ | 1450/2480 [06:41<03:58, 4.32it/s] 59%|█████▊ | 1451/2480 [06:41<04:16, 4.01it/s] 59%|█████▊ | 1452/2480 [06:41<04:08, 4.13it/s] 59%|█████▊ | 1453/2480 [06:41<04:10, 4.10it/s] 59%|█████▊ | 1454/2480 [06:42<04:22, 3.90it/s] 59%|█████▊ | 1455/2480 [06:42<04:20, 3.93it/s] 59%|█████▊ | 1456/2480 [06:42<04:31, 3.78it/s] 59%|█████▉ | 1457/2480 [06:42<04:46, 3.58it/s] 59%|█████▉ | 1458/2480 [06:43<04:21, 3.91it/s] 59%|█████▉ | 1459/2480 [06:43<04:13, 4.02it/s] 59%|█████▉ | 1460/2480 [06:43<04:18, 
3.95it/s] 59%|█████▉ | 1461/2480 [06:43<04:23, 3.87it/s] 59%|█████▉ | 1462/2480 [06:44<04:17, 3.96it/s] 59%|█████▉ | 1463/2480 [06:44<04:19, 3.91it/s] 59%|█████▉ | 1464/2480 [06:44<03:54, 4.32it/s] 59%|█████▉ | 1465/2480 [06:44<03:57, 4.27it/s] 59%|█████▉ | 1466/2480 [06:45<03:41, 4.58it/s] 59%|█████▉ | 1467/2480 [06:45<04:10, 4.05it/s] 59%|█████▉ | 1468/2480 [06:45<04:01, 4.19it/s] 59%|█████▉ | 1469/2480 [06:45<03:53, 4.32it/s] 59%|█████▉ | 1470/2480 [06:45<03:50, 4.38it/s] 59%|█████▉ | 1471/2480 [06:46<03:45, 4.48it/s] 59%|█████▉ | 1472/2480 [06:46<04:06, 4.09it/s] 59%|█████▉ | 1473/2480 [06:46<04:05, 4.10it/s] 59%|█████▉ | 1474/2480 [06:47<04:20, 3.86it/s] 59%|█████▉ | 1475/2480 [06:47<04:04, 4.10it/s] 60%|█████▉ | 1476/2480 [06:47<03:56, 4.25it/s] 60%|█████▉ | 1477/2480 [06:47<04:52, 3.43it/s] 60%|█████▉ | 1478/2480 [06:48<04:46, 3.50it/s] 60%|█████▉ | 1479/2480 [06:48<04:28, 3.72it/s] 60%|█████▉ | 1480/2480 [06:48<04:13, 3.94it/s] 60%|█████▉ | 1481/2480 [06:48<04:14, 3.92it/s] 60%|█████▉ | 1482/2480 [06:49<05:13, 3.18it/s] 60%|█████▉ | 1483/2480 [06:49<04:39, 3.56it/s] 60%|█████▉ | 1484/2480 [06:49<04:23, 3.78it/s] 60%|█████▉ | 1485/2480 [06:49<04:18, 3.85it/s] 60%|█████▉ | 1486/2480 [06:50<04:19, 3.83it/s] 60%|█████▉ | 1487/2480 [06:50<04:45, 3.48it/s] 60%|██████ | 1488/2480 [06:50<04:31, 3.66it/s][INFO|trainer.py:811] 2024-09-04 18:34:26,140 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: ner_tags, id, tokens. If ner_tags, id, tokens are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message. 
[INFO|trainer.py:3819] 2024-09-04 18:34:26,142 >> ***** Running Evaluation *****
[INFO|trainer.py:3821] 2024-09-04 18:34:26,142 >> Num examples = 2519
[INFO|trainer.py:3824] 2024-09-04 18:34:26,142 >> Batch size = 8
{'eval_loss': 0.27709877490997314, 'eval_precision': 0.6584133400707428, 'eval_recall': 0.7131910235358512, 'eval_f1': 0.6847083552285864, 'eval_accuracy': 0.9490840257948603, 'eval_runtime': 5.6532, 'eval_samples_per_second': 445.585, 'eval_steps_per_second': 55.72, 'epoch': 5.0}
0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-1488
[INFO|configuration_utils.py:472] 2024-09-04 18:34:31,600 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-1488/config.json
[INFO|modeling_utils.py:2799] 2024-09-04 18:34:32,617 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-1488/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:34:32,618 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-1488/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:34:32,619 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-1488/special_tokens_map.json
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:34:34,750 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:34:34,750 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
60%|██████ | 1489/2480 [06:59<47:45, 2.89s/it]
70%|███████ | 1736/2480 [08:01<02:51, 4.33it/s]
[INFO|trainer.py:811] 2024-09-04 18:35:37,064 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: ner_tags, id, tokens. If ner_tags, id, tokens are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3819] 2024-09-04 18:35:37,067 >> ***** Running Evaluation *****
[INFO|trainer.py:3821] 2024-09-04 18:35:37,067 >> Num examples = 2519
[INFO|trainer.py:3824] 2024-09-04 18:35:37,067 >> Batch size = 8
{'eval_loss': 0.2968369126319885, 'eval_precision': 0.6668364747834946, 'eval_recall': 0.7164750957854407, 'eval_f1': 0.6907651715039579, 'eval_accuracy': 0.9486348615611665, 'eval_runtime': 5.4549, 'eval_samples_per_second': 461.787, 'eval_steps_per_second': 57.746, 'epoch': 6.0}
{'loss': 0.0084, 'grad_norm': 0.15673314034938812, 'learning_rate': 1.975806451612903e-05, 'epoch': 6.05}
0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-1736
[INFO|configuration_utils.py:472] 2024-09-04 18:35:42,647 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-1736/config.json
[INFO|modeling_utils.py:2799] 2024-09-04 18:35:43,662 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-1736/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:35:43,663 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-1736/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:35:43,663 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-1736/special_tokens_map.json
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:35:45,902 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:35:45,903 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
70%|███████ | 1737/2480 [08:10<36:05, 2.91s/it]
79%|███████▊ | 1947/2480 [09:02<02:31, 
3.51it/s] 79%|███████▊ | 1948/2480 [09:03<02:25, 3.66it/s] 79%|███████▊ | 1949/2480 [09:03<02:14, 3.95it/s] 79%|███████▊ | 1950/2480 [09:03<02:14, 3.95it/s] 79%|███████▊ | 1951/2480 [09:03<02:21, 3.73it/s] 79%|███████▊ | 1952/2480 [09:04<02:18, 3.83it/s] 79%|███████▉ | 1953/2480 [09:04<02:13, 3.95it/s] 79%|███████▉ | 1954/2480 [09:04<02:09, 4.07it/s] 79%|███████▉ | 1955/2480 [09:04<02:00, 4.34it/s] 79%|███████▉ | 1956/2480 [09:04<01:55, 4.54it/s] 79%|███████▉ | 1957/2480 [09:05<01:51, 4.68it/s] 79%|███████▉ | 1958/2480 [09:05<01:49, 4.75it/s] 79%|███████▉ | 1959/2480 [09:05<02:00, 4.31it/s] 79%|███████▉ | 1960/2480 [09:05<01:57, 4.42it/s] 79%|███████▉ | 1961/2480 [09:06<02:15, 3.84it/s] 79%|███████▉ | 1962/2480 [09:06<02:08, 4.02it/s] 79%|███████▉ | 1963/2480 [09:06<02:06, 4.07it/s] 79%|███████▉ | 1964/2480 [09:06<02:00, 4.29it/s] 79%|███████▉ | 1965/2480 [09:07<01:58, 4.36it/s] 79%|███████▉ | 1966/2480 [09:07<01:58, 4.32it/s] 79%|███████▉ | 1967/2480 [09:07<02:00, 4.25it/s] 79%|███████▉ | 1968/2480 [09:07<02:07, 4.03it/s] 79%|███████▉ | 1969/2480 [09:08<02:08, 3.97it/s] 79%|███████▉ | 1970/2480 [09:08<02:01, 4.19it/s] 79%|███████▉ | 1971/2480 [09:08<02:08, 3.96it/s] 80%|███████▉ | 1972/2480 [09:08<01:56, 4.36it/s] 80%|███████▉ | 1973/2480 [09:09<01:59, 4.25it/s] 80%|███████▉ | 1974/2480 [09:09<01:55, 4.39it/s] 80%|███████▉ | 1975/2480 [09:09<01:50, 4.59it/s] 80%|███████▉ | 1976/2480 [09:09<01:46, 4.75it/s] 80%|███████▉ | 1977/2480 [09:09<02:06, 3.98it/s] 80%|███████▉ | 1978/2480 [09:10<02:00, 4.18it/s] 80%|███████▉ | 1979/2480 [09:10<01:53, 4.41it/s] 80%|███████▉ | 1980/2480 [09:10<01:47, 4.66it/s] 80%|███████▉ | 1981/2480 [09:10<01:52, 4.45it/s] 80%|███████▉ | 1982/2480 [09:11<01:56, 4.27it/s] 80%|███████▉ | 1983/2480 [09:11<02:02, 4.07it/s] 80%|████████ | 1984/2480 [09:11<01:51, 4.45it/s][INFO|trainer.py:811] 2024-09-04 18:36:46,809 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and 
have been ignored: ner_tags, id, tokens. If ner_tags, id, tokens are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message. [INFO|trainer.py:3819] 2024-09-04 18:36:46,812 >> ***** Running Evaluation ***** [INFO|trainer.py:3821] 2024-09-04 18:36:46,812 >> Num examples = 2519 [INFO|trainer.py:3824] 2024-09-04 18:36:46,812 >> Batch size = 8 {'eval_loss': 0.3088673949241638, 'eval_precision': 0.6896, 'eval_recall': 0.7077175697865353, 'eval_f1': 0.6985413290113451, 'eval_accuracy': 0.9496936058263018, 'eval_runtime': 5.5771, 'eval_samples_per_second': 451.669, 'eval_steps_per_second': 56.481, 'epoch': 7.0} 0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-1984 [INFO|configuration_utils.py:472] 2024-09-04 18:36:52,116 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-1984/config.json [INFO|modeling_utils.py:2799] 2024-09-04 18:36:53,125 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-1984/model.safetensors [INFO|tokenization_utils_base.py:2684] 2024-09-04 18:36:53,126 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-1984/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-04 18:36:53,126 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-1984/special_tokens_map.json [INFO|tokenization_utils_base.py:2684] 2024-09-04 18:36:55,563 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-04 18:36:55,564 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json 80%|████████ | 1985/2480 [09:20<23:38, 2.87s/it] 80%|████████ | 1986/2480 [09:20<17:05, 2.08s/it] 80%|████████ | 1987/2480 [09:21<12:36, 1.53s/it] 80%|████████ | 1988/2480 [09:21<09:16, 1.13s/it] 80%|████████ | 1989/2480 [09:21<07:01, 1.17it/s] 80%|████████ | 
1990/2480 [09:21<05:20, 1.53it/s] 80%|████████ | 1991/2480 [09:21<04:18, 1.90it/s] 80%|████████ | 1992/2480 [09:22<03:46, 2.15it/s] 80%|████████ | 1993/2480 [09:22<03:15, 2.49it/s] 80%|████████ | 1994/2480 [09:22<02:44, 2.96it/s] 80%|████████ | 1995/2480 [09:22<02:30, 3.23it/s] 80%|████████ | 1996/2480 [09:23<02:19, 3.47it/s] 81%|████████ | 1997/2480 [09:23<02:05, 3.84it/s] 81%|████████ | 1998/2480 [09:23<02:05, 3.85it/s] 81%|████████ | 1999/2480 [09:23<01:58, 4.05it/s] 81%|████████ | 2000/2480 [09:24<02:00, 3.98it/s] 81%|████████ | 2000/2480 [09:24<02:00, 3.98it/s] 81%|████████ | 2001/2480 [09:24<02:07, 3.75it/s] 81%|████████ | 2002/2480 [09:24<02:15, 3.52it/s] 81%|████████ | 2003/2480 [09:24<02:09, 3.69it/s] 81%|████████ | 2004/2480 [09:25<02:05, 3.78it/s] 81%|████████ | 2005/2480 [09:25<02:16, 3.47it/s] 81%|████████ | 2006/2480 [09:25<02:09, 3.67it/s] 81%|████████ | 2007/2480 [09:25<02:01, 3.88it/s] 81%|████████ | 2008/2480 [09:26<01:52, 4.18it/s] 81%|████████ | 2009/2480 [09:26<01:42, 4.61it/s] 81%|████████ | 2010/2480 [09:26<01:54, 4.09it/s] 81%|████████ | 2011/2480 [09:26<01:56, 4.03it/s] 81%|████████ | 2012/2480 [09:27<01:49, 4.26it/s] 81%|████████ | 2013/2480 [09:27<01:52, 4.15it/s] 81%|████████ | 2014/2480 [09:27<01:48, 4.29it/s] 81%|████████▏ | 2015/2480 [09:27<01:47, 4.33it/s] 81%|████████▏ | 2016/2480 [09:27<01:44, 4.43it/s] 81%|████████▏ | 2017/2480 [09:28<01:43, 4.47it/s] 81%|████████▏ | 2018/2480 [09:28<01:41, 4.55it/s] 81%|████████▏ | 2019/2480 [09:28<01:35, 4.81it/s] 81%|████████▏ | 2020/2480 [09:28<01:39, 4.64it/s] 81%|████████▏ | 2021/2480 [09:29<01:47, 4.28it/s] 82%|████████▏ | 2022/2480 [09:29<01:48, 4.21it/s] 82%|████████▏ | 2023/2480 [09:29<01:57, 3.90it/s] 82%|████████▏ | 2024/2480 [09:29<01:53, 4.03it/s] 82%|████████▏ | 2025/2480 [09:30<01:50, 4.13it/s] 82%|████████▏ | 2026/2480 [09:30<01:47, 4.21it/s] 82%|████████▏ | 2027/2480 [09:30<01:51, 4.05it/s] 82%|████████▏ | 2028/2480 [09:31<02:15, 3.33it/s] 82%|████████▏ | 2029/2480 [09:31<02:03, 
3.64it/s] 82%|████████▏ | 2030/2480 [09:31<01:57, 3.84it/s] 82%|████████▏ | 2031/2480 [09:31<01:58, 3.78it/s] 82%|████████▏ | 2032/2480 [09:32<02:01, 3.70it/s] 82%|████████▏ | 2033/2480 [09:32<01:52, 3.99it/s] 82%|████████▏ | 2034/2480 [09:32<01:56, 3.82it/s] 82%|████████▏ | 2035/2480 [09:32<01:58, 3.76it/s] 82%|████████▏ | 2036/2480 [09:33<02:00, 3.67it/s] 82%|████████▏ | 2037/2480 [09:33<01:51, 3.97it/s] 82%|████████▏ | 2038/2480 [09:33<01:48, 4.07it/s] 82%|████████▏ | 2039/2480 [09:33<01:44, 4.22it/s] 82%|████████▏ | 2040/2480 [09:33<01:40, 4.36it/s] 82%|████████▏ | 2041/2480 [09:34<01:44, 4.20it/s] 82%|████████▏ | 2042/2480 [09:34<01:47, 4.09it/s] 82%|████████▏ | 2043/2480 [09:34<01:42, 4.26it/s] 82%|████████▏ | 2044/2480 [09:34<01:50, 3.94it/s] 82%|████████▏ | 2045/2480 [09:35<01:47, 4.06it/s] 82%|████████▎ | 2046/2480 [09:35<01:46, 4.07it/s] 83%|████████▎ | 2047/2480 [09:35<01:44, 4.14it/s] 83%|████████▎ | 2048/2480 [09:35<01:39, 4.35it/s] 83%|████████▎ | 2049/2480 [09:36<01:49, 3.92it/s] 83%|████████▎ | 2050/2480 [09:36<01:39, 4.34it/s] 83%|████████▎ | 2051/2480 [09:36<01:36, 4.44it/s] 83%|████████▎ | 2052/2480 [09:36<01:44, 4.11it/s] 83%|████████▎ | 2053/2480 [09:37<01:41, 4.21it/s] 83%|████████▎ | 2054/2480 [09:37<01:41, 4.19it/s] 83%|████████▎ | 2055/2480 [09:37<01:39, 4.29it/s] 83%|████████▎ | 2056/2480 [09:37<01:33, 4.51it/s] 83%|████████▎ | 2057/2480 [09:37<01:33, 4.52it/s] 83%|████████▎ | 2058/2480 [09:38<01:45, 4.00it/s] 83%|████████▎ | 2059/2480 [09:38<01:37, 4.30it/s] 83%|████████▎ | 2060/2480 [09:38<01:36, 4.37it/s] 83%|████████▎ | 2061/2480 [09:38<01:41, 4.14it/s] 83%|████████▎ | 2062/2480 [09:39<01:44, 3.99it/s] 83%|████████▎ | 2063/2480 [09:39<01:47, 3.89it/s] 83%|████████▎ | 2064/2480 [09:39<01:41, 4.09it/s] 83%|████████▎ | 2065/2480 [09:39<01:38, 4.20it/s] 83%|████████▎ | 2066/2480 [09:40<01:35, 4.32it/s] 83%|████████▎ | 2067/2480 [09:40<01:42, 4.05it/s] 83%|████████▎ | 2068/2480 [09:40<01:48, 3.80it/s] 83%|████████▎ | 2069/2480 [09:40<01:41, 
4.03it/s] 83%|████████▎ | 2070/2480 [09:41<01:39, 4.10it/s] 84%|████████▎ | 2071/2480 [09:41<01:36, 4.23it/s] 84%|████████▎ | 2072/2480 [09:41<01:53, 3.59it/s] 84%|████████▎ | 2073/2480 [09:42<02:18, 2.93it/s] 84%|████████▎ | 2074/2480 [09:42<02:25, 2.78it/s] 84%|████████▎ | 2075/2480 [09:42<02:16, 2.97it/s] 84%|████████▎ | 2076/2480 [09:43<02:16, 2.97it/s] 84%|████████▍ | 2077/2480 [09:43<01:59, 3.38it/s] 84%|████████▍ | 2078/2480 [09:43<01:59, 3.38it/s] 84%|████████▍ | 2079/2480 [09:44<01:49, 3.67it/s] 84%|████████▍ | 2080/2480 [09:44<01:39, 4.01it/s] 84%|████████▍ | 2081/2480 [09:44<01:34, 4.20it/s] 84%|████████▍ | 2082/2480 [09:44<01:34, 4.21it/s] 84%|████████▍ | 2083/2480 [09:44<01:36, 4.10it/s] 84%|████████▍ | 2084/2480 [09:45<01:38, 4.01it/s] 84%|████████▍ | 2085/2480 [09:45<01:36, 4.09it/s] 84%|████████▍ | 2086/2480 [09:45<01:39, 3.95it/s] 84%|████████▍ | 2087/2480 [09:45<01:32, 4.27it/s] 84%|████████▍ | 2088/2480 [09:46<01:34, 4.15it/s] 84%|████████▍ | 2089/2480 [09:46<01:28, 4.43it/s] 84%|████████▍ | 2090/2480 [09:46<01:27, 4.45it/s] 84%|████████▍ | 2091/2480 [09:46<01:21, 4.77it/s] 84%|████████▍ | 2092/2480 [09:46<01:21, 4.74it/s] 84%|████████▍ | 2093/2480 [09:47<01:32, 4.20it/s] 84%|████████▍ | 2094/2480 [09:47<01:30, 4.26it/s] 84%|████████▍ | 2095/2480 [09:47<01:28, 4.34it/s] 85%|████████▍ | 2096/2480 [09:48<01:41, 3.80it/s] 85%|████████▍ | 2097/2480 [09:48<01:36, 3.98it/s] 85%|████████▍ | 2098/2480 [09:48<01:41, 3.76it/s] 85%|████████▍ | 2099/2480 [09:48<01:34, 4.04it/s] 85%|████████▍ | 2100/2480 [09:48<01:29, 4.26it/s] 85%|████████▍ | 2101/2480 [09:49<01:26, 4.40it/s] 85%|████████▍ | 2102/2480 [09:49<01:37, 3.87it/s] 85%|████████▍ | 2103/2480 [09:49<01:49, 3.44it/s] 85%|████████▍ | 2104/2480 [09:50<01:42, 3.66it/s] 85%|████████▍ | 2105/2480 [09:50<01:38, 3.79it/s] 85%|████████▍ | 2106/2480 [09:50<01:30, 4.14it/s] 85%|████████▍ | 2107/2480 [09:50<01:28, 4.20it/s] 85%|████████▌ | 2108/2480 [09:50<01:26, 4.32it/s] 85%|████████▌ | 2109/2480 [09:51<01:25, 
4.33it/s] 85%|████████▌ | 2110/2480 [09:51<01:26, 4.28it/s] 85%|████████▌ | 2111/2480 [09:51<01:27, 4.21it/s] 85%|████████▌ | 2112/2480 [09:51<01:25, 4.33it/s] 85%|████████▌ | 2113/2480 [09:52<01:25, 4.27it/s] 85%|████████▌ | 2114/2480 [09:52<01:27, 4.19it/s] 85%|████████▌ | 2115/2480 [09:52<01:25, 4.25it/s] 85%|████████▌ | 2116/2480 [09:52<01:29, 4.08it/s] 85%|████████▌ | 2117/2480 [09:53<01:25, 4.26it/s] 85%|████████▌ | 2118/2480 [09:53<01:28, 4.11it/s] 85%|████████▌ | 2119/2480 [09:53<01:24, 4.28it/s] 85%|████████▌ | 2120/2480 [09:53<01:20, 4.47it/s] 86%|████████▌ | 2121/2480 [09:54<01:31, 3.93it/s] 86%|████████▌ | 2122/2480 [09:54<01:41, 3.52it/s] 86%|████████▌ | 2123/2480 [09:54<01:38, 3.63it/s] 86%|████████▌ | 2124/2480 [09:54<01:37, 3.64it/s] 86%|████████▌ | 2125/2480 [09:55<01:43, 3.42it/s] 86%|████████▌ | 2126/2480 [09:55<01:33, 3.77it/s] 86%|████████▌ | 2127/2480 [09:55<01:29, 3.94it/s] 86%|████████▌ | 2128/2480 [09:55<01:22, 4.25it/s] 86%|████████▌ | 2129/2480 [09:56<01:27, 4.02it/s] 86%|████████▌ | 2130/2480 [09:56<01:47, 3.26it/s] 86%|████████▌ | 2131/2480 [09:56<01:35, 3.67it/s] 86%|████████▌ | 2132/2480 [09:57<01:30, 3.84it/s] 86%|████████▌ | 2133/2480 [09:57<01:27, 3.95it/s] 86%|████████▌ | 2134/2480 [09:57<01:21, 4.26it/s] 86%|████████▌ | 2135/2480 [09:57<01:29, 3.84it/s] 86%|████████▌ | 2136/2480 [09:58<01:27, 3.94it/s] 86%|████████▌ | 2137/2480 [09:58<01:34, 3.62it/s] 86%|████████▌ | 2138/2480 [09:58<01:26, 3.94it/s] 86%|████████▋ | 2139/2480 [09:58<01:25, 4.01it/s] 86%|████████▋ | 2140/2480 [09:59<01:27, 3.86it/s] 86%|████████▋ | 2141/2480 [09:59<01:19, 4.25it/s] 86%|████████▋ | 2142/2480 [09:59<01:24, 3.99it/s] 86%|████████▋ | 2143/2480 [09:59<01:19, 4.23it/s] 86%|████████▋ | 2144/2480 [09:59<01:15, 4.44it/s] 86%|████████▋ | 2145/2480 [10:00<01:12, 4.64it/s] 87%|████████▋ | 2146/2480 [10:00<01:26, 3.84it/s] 87%|████████▋ | 2147/2480 [10:00<01:20, 4.14it/s] 87%|████████▋ | 2148/2480 [10:00<01:19, 4.15it/s] 87%|████████▋ | 2149/2480 [10:01<01:27, 
3.77it/s] 87%|████████▋ | 2150/2480 [10:01<01:24, 3.90it/s] 87%|████████▋ | 2151/2480 [10:01<01:23, 3.92it/s] 87%|████████▋ | 2152/2480 [10:02<01:20, 4.09it/s] 87%|████████▋ | 2153/2480 [10:02<01:15, 4.30it/s] 87%|████████▋ | 2154/2480 [10:02<01:12, 4.51it/s] 87%|████████▋ | 2155/2480 [10:02<01:15, 4.28it/s] 87%|████████▋ | 2156/2480 [10:02<01:20, 4.03it/s] 87%|████████▋ | 2157/2480 [10:03<01:24, 3.83it/s] 87%|████████▋ | 2158/2480 [10:03<01:17, 4.15it/s] 87%|████████▋ | 2159/2480 [10:03<01:26, 3.73it/s] 87%|████████▋ | 2160/2480 [10:03<01:19, 4.03it/s] 87%|████████▋ | 2161/2480 [10:04<01:22, 3.86it/s] 87%|████████▋ | 2162/2480 [10:04<01:20, 3.94it/s] 87%|████████▋ | 2163/2480 [10:04<01:15, 4.22it/s] 87%|████████▋ | 2164/2480 [10:04<01:13, 4.31it/s] 87%|████████▋ | 2165/2480 [10:05<01:12, 4.35it/s] 87%|████████▋ | 2166/2480 [10:05<01:11, 4.39it/s] 87%|████████▋ | 2167/2480 [10:05<01:10, 4.44it/s] 87%|████████▋ | 2168/2480 [10:05<01:09, 4.50it/s] 87%|████████▋ | 2169/2480 [10:06<01:14, 4.17it/s] 88%|████████▊ | 2170/2480 [10:06<01:22, 3.77it/s] 88%|████████▊ | 2171/2480 [10:06<01:18, 3.95it/s] 88%|████████▊ | 2172/2480 [10:06<01:12, 4.22it/s] 88%|████████▊ | 2173/2480 [10:07<01:10, 4.34it/s] 88%|████████▊ | 2174/2480 [10:07<01:09, 4.42it/s] 88%|████████▊ | 2175/2480 [10:07<01:05, 4.64it/s] 88%|████████▊ | 2176/2480 [10:07<01:11, 4.26it/s] 88%|████████▊ | 2177/2480 [10:08<01:13, 4.12it/s] 88%|████████▊ | 2178/2480 [10:08<01:08, 4.43it/s] 88%|████████▊ | 2179/2480 [10:08<01:07, 4.43it/s] 88%|████████▊ | 2180/2480 [10:08<01:05, 4.60it/s] 88%|████████▊ | 2181/2480 [10:08<01:07, 4.43it/s] 88%|████████▊ | 2182/2480 [10:09<01:06, 4.47it/s] 88%|████████▊ | 2183/2480 [10:09<01:02, 4.73it/s] 88%|████████▊ | 2184/2480 [10:09<01:02, 4.71it/s] 88%|████████▊ | 2185/2480 [10:09<01:02, 4.71it/s] 88%|████████▊ | 2186/2480 [10:09<01:00, 4.83it/s] 88%|████████▊ | 2187/2480 [10:10<01:13, 3.97it/s] 88%|████████▊ | 2188/2480 [10:10<01:14, 3.92it/s] 88%|████████▊ | 2189/2480 [10:10<01:14, 
3.92it/s] 88%|████████▊ | 2190/2480 [10:10<01:09, 4.15it/s] 88%|████████▊ | 2191/2480 [10:11<01:14, 3.88it/s] 88%|████████▊ | 2192/2480 [10:11<01:26, 3.34it/s] 88%|████████▊ | 2193/2480 [10:11<01:21, 3.51it/s] 88%|████████▊ | 2194/2480 [10:12<01:15, 3.81it/s] 89%|████████▊ | 2195/2480 [10:12<01:13, 3.89it/s] 89%|████████▊ | 2196/2480 [10:12<01:09, 4.09it/s] 89%|████████▊ | 2197/2480 [10:12<01:06, 4.26it/s] 89%|████████▊ | 2198/2480 [10:12<01:02, 4.49it/s] 89%|████████▊ | 2199/2480 [10:13<01:13, 3.82it/s] 89%|████████▊ | 2200/2480 [10:13<01:08, 4.09it/s] 89%|████████▉ | 2201/2480 [10:13<01:05, 4.28it/s] 89%|████████▉ | 2202/2480 [10:14<01:09, 4.01it/s] 89%|████████▉ | 2203/2480 [10:14<01:10, 3.96it/s] 89%|████████▉ | 2204/2480 [10:14<01:06, 4.15it/s] 89%|████████▉ | 2205/2480 [10:14<01:06, 4.16it/s] 89%|████████▉ | 2206/2480 [10:14<01:01, 4.45it/s] 89%|████████▉ | 2207/2480 [10:15<00:58, 4.66it/s] 89%|████████▉ | 2208/2480 [10:15<00:56, 4.80it/s] 89%|████████▉ | 2209/2480 [10:15<01:00, 4.50it/s] 89%|████████▉ | 2210/2480 [10:15<01:12, 3.71it/s] 89%|████████▉ | 2211/2480 [10:16<01:10, 3.79it/s] 89%|████████▉ | 2212/2480 [10:16<01:07, 3.99it/s] 89%|████████▉ | 2213/2480 [10:16<01:04, 4.13it/s] 89%|████████▉ | 2214/2480 [10:16<01:08, 3.87it/s] 89%|████████▉ | 2215/2480 [10:17<01:05, 4.04it/s] 89%|████████▉ | 2216/2480 [10:17<01:01, 4.29it/s] 89%|████████▉ | 2217/2480 [10:17<00:59, 4.41it/s] 89%|████████▉ | 2218/2480 [10:17<00:55, 4.75it/s] 89%|████████▉ | 2219/2480 [10:18<00:57, 4.53it/s] 90%|████████▉ | 2220/2480 [10:18<00:55, 4.66it/s] 90%|████████▉ | 2221/2480 [10:18<00:58, 4.45it/s] 90%|████████▉ | 2222/2480 [10:18<00:58, 4.40it/s] 90%|████████▉ | 2223/2480 [10:18<00:57, 4.50it/s] 90%|████████▉ | 2224/2480 [10:19<00:58, 4.39it/s] 90%|████████▉ | 2225/2480 [10:19<00:55, 4.56it/s] 90%|████████▉ | 2226/2480 [10:19<00:55, 4.55it/s] 90%|████████▉ | 2227/2480 [10:19<00:58, 4.36it/s] 90%|████████▉ | 2228/2480 [10:20<01:01, 4.10it/s] 90%|████████▉ | 2229/2480 [10:20<00:56, 
4.46it/s] 90%|████████▉ | 2230/2480 [10:20<00:56, 4.41it/s] 90%|████████▉ | 2231/2480 [10:20<00:54, 4.58it/s] 90%|█████████ | 2232/2480 [10:20<00:47, 5.21it/s][INFO|trainer.py:811] 2024-09-04 18:37:56,135 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: ner_tags, id, tokens. If ner_tags, id, tokens are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message. [INFO|trainer.py:3819] 2024-09-04 18:37:56,137 >> ***** Running Evaluation ***** [INFO|trainer.py:3821] 2024-09-04 18:37:56,137 >> Num examples = 2519 [INFO|trainer.py:3824] 2024-09-04 18:37:56,137 >> Batch size = 8 {'eval_loss': 0.31877079606056213, 'eval_precision': 0.6825145272054939, 'eval_recall': 0.7071702244116037, 'eval_f1': 0.6946236559139785, 'eval_accuracy': 0.9498861047835991, 'eval_runtime': 5.3015, 'eval_samples_per_second': 475.15, 'eval_steps_per_second': 59.417, 'epoch': 8.0} {'loss': 0.0042, 'grad_norm': 0.22283445298671722, 'learning_rate': 9.67741935483871e-06, 'epoch': 8.06} 0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-2232 [INFO|configuration_utils.py:472] 2024-09-04 18:38:01,591 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-2232/config.json [INFO|modeling_utils.py:2799] 2024-09-04 18:38:02,602 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-2232/model.safetensors [INFO|tokenization_utils_base.py:2684] 2024-09-04 18:38:02,603 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-2232/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-04 18:38:02,604 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-2232/special_tokens_map.json [INFO|tokenization_utils_base.py:2684] 2024-09-04 18:38:04,693 >> tokenizer config file saved in 
/content/dissertation/scripts/ner/output/tokenizer_config.json [INFO|tokenization_utils_base.py:2693] 2024-09-04 18:38:04,693 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json 90%|█████████ | 2233/2480 [10:29<11:33, 2.81s/it] 90%|█████████ | 2234/2480 [10:29<08:17, 2.02s/it] 90%|█████████ | 2235/2480 [10:30<06:15, 1.53s/it] 90%|█████████ | 2236/2480 [10:30<04:36, 1.13s/it] 90%|█████████ | 2237/2480 [10:30<03:29, 1.16it/s] 90%|█████████ | 2238/2480 [10:30<02:41, 1.50it/s] 90%|█████████ | 2239/2480 [10:31<02:12, 1.83it/s] 90%|█████████ | 2240/2480 [10:31<01:48, 2.21it/s] 90%|█████████ | 2241/2480 [10:31<01:30, 2.65it/s] 90%|█████████ | 2242/2480 [10:32<01:29, 2.67it/s] 90%|█████████ | 2243/2480 [10:32<01:18, 3.00it/s] 90%|█████████ | 2244/2480 [10:32<01:12, 3.25it/s] 91%|█████████ | 2245/2480 [10:32<01:06, 3.55it/s] 91%|█████████ | 2246/2480 [10:33<01:05, 3.57it/s] 91%|█████████ | 2247/2480 [10:33<01:11, 3.25it/s] 91%|█████████ | 2248/2480 [10:33<01:06, 3.51it/s] 91%|█████████ | 2249/2480 [10:33<01:05, 3.53it/s] 91%|█████████ | 2250/2480 [10:34<01:01, 3.75it/s] 91%|█████████ | 2251/2480 [10:34<00:58, 3.92it/s] 91%|█████████ | 2252/2480 [10:34<00:54, 4.18it/s] 91%|█████████ | 2253/2480 [10:34<00:53, 4.27it/s] 91%|█████████ | 2254/2480 [10:35<01:09, 3.26it/s] 91%|█████████ | 2255/2480 [10:35<01:05, 3.45it/s] 91%|█████████ | 2256/2480 [10:35<01:01, 3.66it/s] 91%|█████████ | 2257/2480 [10:35<00:58, 3.78it/s] 91%|█████████ | 2258/2480 [10:36<00:56, 3.92it/s] 91%|█████████ | 2259/2480 [10:36<00:53, 4.14it/s] 91%|█████████ | 2260/2480 [10:36<00:52, 4.23it/s] 91%|█████████ | 2261/2480 [10:36<00:47, 4.61it/s] 91%|█████████ | 2262/2480 [10:37<00:51, 4.22it/s] 91%|█████████▏| 2263/2480 [10:37<00:50, 4.31it/s] 91%|█████████▏| 2264/2480 [10:37<00:51, 4.18it/s] 91%|█████████▏| 2265/2480 [10:37<00:49, 4.34it/s] 91%|█████████▏| 2266/2480 [10:37<00:46, 4.57it/s] 91%|█████████▏| 2267/2480 [10:38<00:49, 4.30it/s] 91%|█████████▏| 2268/2480 
[10:38<00:52, 4.07it/s] 91%|█████████▏| 2269/2480 [10:38<00:52, 4.02it/s] 92%|█████████▏| 2270/2480 [10:39<00:52, 3.97it/s] 92%|█████████▏| 2271/2480 [10:39<00:48, 4.27it/s] 92%|█████████▏| 2272/2480 [10:39<00:49, 4.22it/s] 92%|█████████▏| 2273/2480 [10:39<00:47, 4.39it/s] 92%|█████████▏| 2274/2480 [10:39<00:46, 4.41it/s] 92%|█████████▏| 2275/2480 [10:40<00:48, 4.23it/s] 92%|█████████▏| 2276/2480 [10:40<00:48, 4.18it/s] 92%|█████████▏| 2277/2480 [10:40<00:48, 4.20it/s] 92%|█████████▏| 2278/2480 [10:40<00:45, 4.45it/s] 92%|█████████▏| 2279/2480 [10:41<00:44, 4.50it/s] 92%|█████████▏| 2280/2480 [10:41<00:43, 4.58it/s] 92%|█████████▏| 2281/2480 [10:41<00:46, 4.29it/s] 92%|█████████▏| 2282/2480 [10:41<00:45, 4.31it/s] 92%|█████████▏| 2283/2480 [10:42<00:47, 4.11it/s] 92%|█████████▏| 2284/2480 [10:42<00:46, 4.18it/s] 92%|█████████▏| 2285/2480 [10:42<00:47, 4.08it/s] 92%|█████████▏| 2286/2480 [10:42<00:43, 4.41it/s] 92%|█████████▏| 2287/2480 [10:42<00:42, 4.53it/s] 92%|█████████▏| 2288/2480 [10:43<00:41, 4.66it/s] 92%|█████████▏| 2289/2480 [10:43<00:43, 4.35it/s] 92%|█████████▏| 2290/2480 [10:43<00:48, 3.90it/s] 92%|█████████▏| 2291/2480 [10:44<00:52, 3.59it/s] 92%|█████████▏| 2292/2480 [10:44<00:50, 3.69it/s] 92%|█████████▏| 2293/2480 [10:44<00:49, 3.79it/s] 92%|█████████▎| 2294/2480 [10:44<00:49, 3.79it/s] 93%|█████████▎| 2295/2480 [10:45<00:49, 3.75it/s] 93%|█████████▎| 2296/2480 [10:45<00:56, 3.27it/s] 93%|█████████▎| 2297/2480 [10:45<00:50, 3.65it/s] 93%|█████████▎| 2298/2480 [10:45<00:52, 3.50it/s] 93%|█████████▎| 2299/2480 [10:46<00:48, 3.77it/s] 93%|█████████▎| 2300/2480 [10:46<00:44, 4.06it/s] 93%|█████████▎| 2301/2480 [10:46<00:43, 4.16it/s] 93%|█████████▎| 2302/2480 [10:47<00:51, 3.48it/s] 93%|█████████▎| 2303/2480 [10:47<00:46, 3.79it/s] 93%|█████████▎| 2304/2480 [10:47<00:47, 3.67it/s] 93%|█████████▎| 2305/2480 [10:47<00:44, 3.96it/s] 93%|█████████▎| 2306/2480 [10:47<00:43, 4.05it/s] 93%|█████████▎| 2307/2480 [10:48<00:45, 3.79it/s] 93%|█████████▎| 2308/2480 
[10:48<00:41, 4.13it/s] 93%|█████████▎| 2309/2480 [10:48<00:40, 4.25it/s] 93%|█████████▎| 2310/2480 [10:48<00:38, 4.38it/s] 93%|█████████▎| 2311/2480 [10:49<00:42, 3.94it/s] 93%|█████████▎| 2312/2480 [10:49<00:40, 4.11it/s] 93%|█████████▎| 2313/2480 [10:49<00:39, 4.20it/s] 93%|█████████▎| 2314/2480 [10:49<00:39, 4.17it/s] 93%|█████████▎| 2315/2480 [10:50<00:37, 4.41it/s] 93%|█████████▎| 2316/2480 [10:50<00:43, 3.79it/s] 93%|█████████▎| 2317/2480 [10:50<00:41, 3.96it/s] 93%|█████████▎| 2318/2480 [10:50<00:42, 3.81it/s] 94%|█████████▎| 2319/2480 [10:51<00:39, 4.08it/s] 94%|█████████▎| 2320/2480 [10:51<00:38, 4.17it/s] 94%|█████████▎| 2321/2480 [10:51<00:36, 4.35it/s] 94%|█████████▎| 2322/2480 [10:51<00:39, 3.97it/s] 94%|█████████▎| 2323/2480 [10:52<00:39, 4.01it/s] 94%|█████████▎| 2324/2480 [10:52<00:37, 4.19it/s] 94%|█████████▍| 2325/2480 [10:52<00:34, 4.53it/s] 94%|█████████▍| 2326/2480 [10:52<00:34, 4.45it/s] 94%|█████████▍| 2327/2480 [10:52<00:34, 4.42it/s] 94%|█████████▍| 2328/2480 [10:53<00:33, 4.56it/s] 94%|█████████▍| 2329/2480 [10:53<00:31, 4.74it/s] 94%|█████████▍| 2330/2480 [10:53<00:30, 4.86it/s] 94%|█████████▍| 2331/2480 [10:53<00:33, 4.48it/s] 94%|█████████▍| 2332/2480 [10:54<00:32, 4.58it/s] 94%|█████████▍| 2333/2480 [10:54<00:34, 4.28it/s] 94%|█████████▍| 2334/2480 [10:54<00:35, 4.07it/s] 94%|█████████▍| 2335/2480 [10:54<00:34, 4.18it/s] 94%|█████████▍| 2336/2480 [10:55<00:33, 4.24it/s] 94%|█████████▍| 2337/2480 [10:55<00:32, 4.40it/s] 94%|█████████▍| 2338/2480 [10:55<00:31, 4.51it/s] 94%|█████████▍| 2339/2480 [10:55<00:33, 4.23it/s] 94%|█████████▍| 2340/2480 [10:56<00:38, 3.63it/s] 94%|█████████▍| 2341/2480 [10:56<00:42, 3.27it/s] 94%|█████████▍| 2342/2480 [10:56<00:41, 3.31it/s] 94%|█████████▍| 2343/2480 [10:57<00:39, 3.45it/s] 95%|█████████▍| 2344/2480 [10:57<00:40, 3.39it/s] 95%|█████████▍| 2345/2480 [10:57<00:36, 3.73it/s] 95%|█████████▍| 2346/2480 [10:57<00:40, 3.32it/s] 95%|█████████▍| 2347/2480 [10:58<00:35, 3.78it/s] 95%|█████████▍| 2348/2480 
[10:58<00:32, 4.10it/s] 95%|█████████▍| 2349/2480 [10:58<00:30, 4.29it/s] 95%|█████████▍| 2350/2480 [10:58<00:30, 4.29it/s] 95%|█████████▍| 2351/2480 [10:58<00:29, 4.38it/s] 95%|█████████▍| 2352/2480 [10:59<00:28, 4.46it/s] 95%|█████████▍| 2353/2480 [10:59<00:30, 4.17it/s] 95%|█████████▍| 2354/2480 [10:59<00:31, 4.04it/s] 95%|█████████▍| 2355/2480 [10:59<00:30, 4.16it/s] 95%|█████████▌| 2356/2480 [11:00<00:29, 4.15it/s] 95%|█████████▌| 2357/2480 [11:00<00:28, 4.25it/s] 95%|█████████▌| 2358/2480 [11:00<00:29, 4.14it/s] 95%|█████████▌| 2359/2480 [11:00<00:27, 4.34it/s] 95%|█████████▌| 2360/2480 [11:01<00:27, 4.34it/s] 95%|█████████▌| 2361/2480 [11:01<00:28, 4.22it/s] 95%|█████████▌| 2362/2480 [11:01<00:27, 4.33it/s] 95%|█████████▌| 2363/2480 [11:01<00:28, 4.12it/s] 95%|█████████▌| 2364/2480 [11:02<00:27, 4.26it/s] 95%|█████████▌| 2365/2480 [11:02<00:27, 4.24it/s] 95%|█████████▌| 2366/2480 [11:02<00:26, 4.28it/s] 95%|█████████▌| 2367/2480 [11:02<00:26, 4.22it/s] 95%|█████████▌| 2368/2480 [11:03<00:28, 3.94it/s] 96%|█████████▌| 2369/2480 [11:03<00:29, 3.83it/s] 96%|█████████▌| 2370/2480 [11:03<00:27, 3.97it/s] 96%|█████████▌| 2371/2480 [11:03<00:29, 3.75it/s] 96%|█████████▌| 2372/2480 [11:04<00:28, 3.78it/s] 96%|█████████▌| 2373/2480 [11:04<00:26, 4.10it/s] 96%|█████████▌| 2374/2480 [11:04<00:25, 4.22it/s] 96%|█████████▌| 2375/2480 [11:04<00:24, 4.35it/s] 96%|█████████▌| 2376/2480 [11:04<00:22, 4.58it/s] 96%|█████████▌| 2377/2480 [11:05<00:23, 4.37it/s] 96%|█████████▌| 2378/2480 [11:05<00:23, 4.43it/s] 96%|█████████▌| 2379/2480 [11:05<00:26, 3.88it/s] 96%|█████████▌| 2380/2480 [11:05<00:24, 4.03it/s] 96%|█████████▌| 2381/2480 [11:06<00:24, 4.09it/s] 96%|█████████▌| 2382/2480 [11:06<00:24, 3.94it/s] 96%|█████████▌| 2383/2480 [11:06<00:26, 3.62it/s] 96%|█████████▌| 2384/2480 [11:07<00:25, 3.84it/s] 96%|█████████▌| 2385/2480 [11:07<00:22, 4.17it/s] 96%|█████████▌| 2386/2480 [11:07<00:23, 4.02it/s] 96%|█████████▋| 2387/2480 [11:07<00:22, 4.10it/s] 96%|█████████▋| 2388/2480 
[11:07<00:22, 4.10it/s] 96%|█████████▋| 2389/2480 [11:08<00:20, 4.40it/s] 96%|█████████▋| 2390/2480 [11:08<00:20, 4.30it/s] 96%|█████████▋| 2391/2480 [11:08<00:23, 3.83it/s] 96%|█████████▋| 2392/2480 [11:09<00:24, 3.61it/s] 96%|█████████▋| 2393/2480 [11:09<00:22, 3.92it/s] 97%|█████████▋| 2394/2480 [11:09<00:21, 3.99it/s] 97%|█████████▋| 2395/2480 [11:09<00:20, 4.18it/s] 97%|█████████▋| 2396/2480 [11:10<00:22, 3.74it/s] 97%|█████████▋| 2397/2480 [11:10<00:20, 4.01it/s] 97%|█████████▋| 2398/2480 [11:10<00:19, 4.19it/s] 97%|█████████▋| 2399/2480 [11:10<00:19, 4.21it/s] 97%|█████████▋| 2400/2480 [11:10<00:19, 4.00it/s] 97%|█████████▋| 2401/2480 [11:11<00:20, 3.89it/s] 97%|█████████▋| 2402/2480 [11:11<00:20, 3.90it/s] 97%|█████████▋| 2403/2480 [11:11<00:18, 4.11it/s] 97%|█████████▋| 2404/2480 [11:11<00:18, 4.09it/s] 97%|█████████▋| 2405/2480 [11:12<00:21, 3.54it/s] 97%|█████████▋| 2406/2480 [11:12<00:20, 3.64it/s] 97%|█████████▋| 2407/2480 [11:12<00:22, 3.23it/s] 97%|█████████▋| 2408/2480 [11:13<00:19, 3.61it/s] 97%|█████████▋| 2409/2480 [11:13<00:18, 3.83it/s] 97%|█████████▋| 2410/2480 [11:13<00:17, 4.10it/s] 97%|█████████▋| 2411/2480 [11:13<00:17, 4.01it/s] 97%|█████████▋| 2412/2480 [11:14<00:15, 4.29it/s] 97%|█████████▋| 2413/2480 [11:14<00:16, 4.13it/s] 97%|█████████▋| 2414/2480 [11:14<00:15, 4.29it/s] 97%|█████████▋| 2415/2480 [11:14<00:14, 4.44it/s] 97%|█████████▋| 2416/2480 [11:14<00:14, 4.36it/s] 97%|█████████▋| 2417/2480 [11:15<00:14, 4.37it/s] 98%|█████████▊| 2418/2480 [11:15<00:13, 4.59it/s] 98%|█████████▊| 2419/2480 [11:15<00:14, 4.12it/s] 98%|█████████▊| 2420/2480 [11:15<00:13, 4.42it/s] 98%|█████████▊| 2421/2480 [11:16<00:14, 4.15it/s] 98%|█████████▊| 2422/2480 [11:16<00:14, 4.13it/s] 98%|█████████▊| 2423/2480 [11:16<00:16, 3.54it/s] 98%|█████████▊| 2424/2480 [11:17<00:14, 3.80it/s] 98%|█████████▊| 2425/2480 [11:17<00:14, 3.88it/s] 98%|█████████▊| 2426/2480 [11:17<00:13, 4.15it/s] 98%|█████████▊| 2427/2480 [11:17<00:16, 3.24it/s] 98%|█████████▊| 2428/2480 
[11:18<00:14, 3.55it/s] 98%|█████████▊| 2429/2480 [11:18<00:13, 3.74it/s] 98%|█████████▊| 2430/2480 [11:18<00:14, 3.54it/s] 98%|█████████▊| 2431/2480 [11:18<00:12, 3.87it/s] 98%|█████████▊| 2432/2480 [11:19<00:12, 3.77it/s] 98%|█████████▊| 2433/2480 [11:19<00:13, 3.46it/s] 98%|█████████▊| 2434/2480 [11:19<00:11, 3.87it/s] 98%|█████████▊| 2435/2480 [11:19<00:11, 4.06it/s] 98%|█████████▊| 2436/2480 [11:20<00:11, 3.98it/s] 98%|█████████▊| 2437/2480 [11:20<00:10, 4.17it/s] 98%|█████████▊| 2438/2480 [11:20<00:09, 4.22it/s] 98%|█████████▊| 2439/2480 [11:20<00:09, 4.51it/s] 98%|█████████▊| 2440/2480 [11:21<00:09, 4.24it/s] 98%|█████████▊| 2441/2480 [11:21<00:10, 3.67it/s] 98%|█████████▊| 2442/2480 [11:21<00:09, 4.10it/s] 99%|█████████▊| 2443/2480 [11:21<00:08, 4.43it/s] 99%|█████████▊| 2444/2480 [11:22<00:08, 4.35it/s] 99%|█████████▊| 2445/2480 [11:22<00:07, 4.39it/s] 99%|█████████▊| 2446/2480 [11:22<00:07, 4.59it/s] 99%|█████████▊| 2447/2480 [11:22<00:07, 4.49it/s] 99%|█████████▊| 2448/2480 [11:22<00:07, 4.44it/s] 99%|█████████▉| 2449/2480 [11:23<00:06, 4.45it/s] 99%|█████████▉| 2450/2480 [11:23<00:07, 4.28it/s] 99%|█████████▉| 2451/2480 [11:23<00:06, 4.29it/s] 99%|█████████▉| 2452/2480 [11:23<00:06, 4.09it/s] 99%|█████████▉| 2453/2480 [11:24<00:06, 4.37it/s] 99%|█████████▉| 2454/2480 [11:24<00:06, 4.12it/s] 99%|█████████▉| 2455/2480 [11:24<00:05, 4.17it/s] 99%|█████████▉| 2456/2480 [11:24<00:05, 4.01it/s] 99%|█████████▉| 2457/2480 [11:25<00:05, 4.14it/s] 99%|█████████▉| 2458/2480 [11:25<00:05, 4.21it/s] 99%|█████████▉| 2459/2480 [11:25<00:05, 4.08it/s] 99%|█████████▉| 2460/2480 [11:25<00:05, 3.90it/s] 99%|█████████▉| 2461/2480 [11:26<00:05, 3.59it/s] 99%|█████████▉| 2462/2480 [11:26<00:04, 3.90it/s] 99%|█████████▉| 2463/2480 [11:26<00:04, 3.62it/s] 99%|█████████▉| 2464/2480 [11:26<00:04, 3.71it/s] 99%|█████████▉| 2465/2480 [11:27<00:03, 3.89it/s] 99%|█████████▉| 2466/2480 [11:27<00:03, 3.90it/s] 99%|█████████▉| 2467/2480 [11:27<00:03, 3.88it/s] 100%|█████████▉| 
2468/2480 [11:27<00:02, 4.15it/s] 100%|█████████▉| 2469/2480 [11:28<00:02, 3.98it/s] 100%|█████████▉| 2470/2480 [11:28<00:02, 4.14it/s] 100%|█████████▉| 2471/2480 [11:28<00:02, 3.69it/s] 100%|█████████▉| 2472/2480 [11:28<00:01, 4.11it/s] 100%|█████████▉| 2473/2480 [11:29<00:01, 4.49it/s] 100%|█████████▉| 2474/2480 [11:29<00:01, 4.31it/s] 100%|█████████▉| 2475/2480 [11:29<00:01, 4.47it/s] 100%|█████████▉| 2476/2480 [11:29<00:00, 4.60it/s] 100%|█████████▉| 2477/2480 [11:30<00:00, 4.38it/s] 100%|█████████▉| 2478/2480 [11:30<00:00, 4.36it/s] 100%|█████████▉| 2479/2480 [11:30<00:00, 4.12it/s] 100%|██████████| 2480/2480 [11:30<00:00, 4.19it/s]
[INFO|trainer.py:3503] 2024-09-04 18:39:06,075 >> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-2480
[INFO|configuration_utils.py:472] 2024-09-04 18:39:06,077 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-2480/config.json
[INFO|modeling_utils.py:2799] 2024-09-04 18:39:07,086 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-2480/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:39:07,087 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-2480/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:39:07,088 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-2480/special_tokens_map.json
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:39:12,383 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:39:12,383 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
[INFO|trainer.py:811] 2024-09-04 18:39:12,431 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: ner_tags, id, tokens. If ner_tags, id, tokens are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3819] 2024-09-04 18:39:12,433 >> ***** Running Evaluation *****
[INFO|trainer.py:3821] 2024-09-04 18:39:12,433 >> Num examples = 2519
[INFO|trainer.py:3824] 2024-09-04 18:39:12,433 >> Batch size = 8
{'eval_loss': 0.3295721411705017, 'eval_precision': 0.6808953669963561, 'eval_recall': 0.715927750410509, 'eval_f1': 0.6979722518676629, 'eval_accuracy': 0.9494529821296801, 'eval_runtime': 5.4512, 'eval_samples_per_second': 462.102, 'eval_steps_per_second': 57.786, 'epoch': 9.0}
0%| | 0/315 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output/checkpoint-2480
[INFO|configuration_utils.py:472] 2024-09-04 18:39:18,034 >> Configuration saved in /content/dissertation/scripts/ner/output/checkpoint-2480/config.json
[INFO|modeling_utils.py:2799] 2024-09-04 18:39:19,381 >> Model weights saved in /content/dissertation/scripts/ner/output/checkpoint-2480/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:39:19,384 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/checkpoint-2480/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:39:19,384 >> Special tokens file saved in /content/dissertation/scripts/ner/output/checkpoint-2480/special_tokens_map.json
[INFO|trainer.py:2394] 2024-09-04 18:39:21,164 >> Training completed. Do not forget to share your model on huggingface.co/models =)
[INFO|trainer.py:2632] 2024-09-04 18:39:21,164 >> Loading best model from /content/dissertation/scripts/ner/output/checkpoint-1736 (score: 0.6985413290113451).
100%|██████████| 2480/2480 [11:46<00:00, 4.19it/s] 100%|██████████| 2480/2480 [11:46<00:00, 3.51it/s]
[INFO|trainer.py:4283] 2024-09-04 18:39:21,354 >> Waiting for the current checkpoint push to be finished, this might take a couple of minutes.
[INFO|trainer.py:3503] 2024-09-04 18:39:48,421 >> Saving model checkpoint to /content/dissertation/scripts/ner/output
[INFO|configuration_utils.py:472] 2024-09-04 18:39:48,422 >> Configuration saved in /content/dissertation/scripts/ner/output/config.json
[INFO|modeling_utils.py:2799] 2024-09-04 18:39:49,752 >> Model weights saved in /content/dissertation/scripts/ner/output/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:39:49,753 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:39:49,753 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
[INFO|trainer.py:3503] 2024-09-04 18:39:49,800 >> Saving model checkpoint to /content/dissertation/scripts/ner/output
[INFO|configuration_utils.py:472] 2024-09-04 18:39:49,801 >> Configuration saved in /content/dissertation/scripts/ner/output/config.json
[INFO|modeling_utils.py:2799] 2024-09-04 18:39:51,135 >> Model weights saved in /content/dissertation/scripts/ner/output/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:39:51,136 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:39:51,137 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
{'eval_loss': 0.33284899592399597, 'eval_precision': 0.6814159292035398, 'eval_recall': 0.7164750957854407, 'eval_f1': 0.6985058697972252, 'eval_accuracy': 0.9498861047835991, 'eval_runtime': 5.5975, 'eval_samples_per_second': 450.025, 'eval_steps_per_second': 56.276, 'epoch': 10.0}
{'train_runtime': 706.0488, 'train_samples_per_second': 224.46, 'train_steps_per_second': 3.513, 'train_loss': 0.03527186407196906, 'epoch': 10.0}
events.out.tfevents.1725474455.a5c501872057.1590.0: 0%| | 0.00/11.1k [00:00> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: ner_tags, id, tokens. If ner_tags, id, tokens are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3819] 2024-09-04 18:39:59,627 >> ***** Running Evaluation *****
[INFO|trainer.py:3821] 2024-09-04 18:39:59,627 >> Num examples = 2519
[INFO|trainer.py:3824] 2024-09-04 18:39:59,627 >> Batch size = 8
0%| | 0/315 [00:00> The following columns in the test set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: ner_tags, id, tokens. If ner_tags, id, tokens are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
[INFO|trainer.py:3819] 2024-09-04 18:40:05,194 >> ***** Running Prediction *****
[INFO|trainer.py:3821] 2024-09-04 18:40:05,194 >> Num examples = 4047
[INFO|trainer.py:3824] 2024-09-04 18:40:05,194 >> Batch size = 8
0%| | 0/506 [00:00> Saving model checkpoint to /content/dissertation/scripts/ner/output
[INFO|configuration_utils.py:472] 2024-09-04 18:40:14,233 >> Configuration saved in /content/dissertation/scripts/ner/output/config.json
[INFO|modeling_utils.py:2799] 2024-09-04 18:40:15,628 >> Model weights saved in /content/dissertation/scripts/ner/output/model.safetensors
[INFO|tokenization_utils_base.py:2684] 2024-09-04 18:40:15,629 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
[INFO|tokenization_utils_base.py:2693] 2024-09-04 18:40:15,629 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
***** predict metrics *****
predict_accuracy = 0.9466
predict_f1 = 0.6937
predict_loss = 0.3369
predict_precision = 0.6945
predict_recall = 0.693
predict_runtime = 0:00:08.87
predict_samples_per_second = 455.971
predict_steps_per_second = 57.01
events.out.tfevents.1725475205.a5c501872057.1590.1: 0%| | 0.00/560 [00:00