2024-06-26 07:16:28,967 INFO    MainThread:34 [wandb_setup.py:_flush():76] Current SDK version is 0.17.0
2024-06-26 07:16:28,967 INFO    MainThread:34 [wandb_setup.py:_flush():76] Configure stats pid to 34
2024-06-26 07:16:28,967 INFO    MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /root/.config/wandb/settings
2024-06-26 07:16:28,967 INFO    MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /kaggle/working/wandb/settings
2024-06-26 07:16:28,967 INFO    MainThread:34 [wandb_setup.py:_flush():76] Loading settings from environment variables: {}
2024-06-26 07:16:28,967 INFO    MainThread:34 [wandb_setup.py:_flush():76] Applying setup settings: {'_disable_service': False}
2024-06-26 07:16:28,967 INFO    MainThread:34 [wandb_setup.py:_flush():76] Inferring run settings from compute environment: {'program': '<python with no main file>'}
2024-06-26 07:16:28,967 INFO    MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {}
2024-06-26 07:16:28,967 INFO    MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {}
2024-06-26 07:16:28,967 INFO    MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {}
2024-06-26 07:16:28,967 INFO    MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {}
2024-06-26 07:16:28,967 INFO    MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {'api_key': '***REDACTED***'}
2024-06-26 07:16:28,967 ERROR   MainThread:34 [wandb_setup.py:_flush():78] error in wandb.init()
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3553, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/tmp/ipykernel_34/2014566126.py", line 10, in <module>
    trainer.train()
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 1885, in train
    return inner_training_loop(
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 2147, in _inner_training_loop
    self.control = self.callback_handler.on_train_begin(args, self.state, self.control)
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer_callback.py", line 454, in on_train_begin
    return self.call_event("on_train_begin", args, state, control)
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer_callback.py", line 498, in call_event
    result = getattr(callback, event)(
  File "/opt/conda/lib/python3.10/site-packages/transformers/integrations/integration_utils.py", line 773, in on_train_begin
    self.setup(args, state, model, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/integrations/integration_utils.py", line 746, in setup
    self._wandb.init(
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 1178, in init
    wandb._sentry.reraise(e)
  File "/opt/conda/lib/python3.10/site-packages/wandb/analytics/sentry.py", line 155, in reraise
    raise exc.with_traceback(sys.exc_info()[2])
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 1163, in init
    wi.setup(kwargs)
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 300, in setup
    wandb_login._login(
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_login.py", line 334, in _login
    wlogin.prompt_api_key()
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_login.py", line 256, in prompt_api_key
    key, status = self._prompt_api_key()
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_login.py", line 236, in _prompt_api_key
    key = apikey.prompt_api_key(
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/lib/apikey.py", line 151, in prompt_api_key
    key = input_callback(api_ask).strip()
  File "/opt/conda/lib/python3.10/site-packages/click/termui.py", line 164, in prompt
    value = prompt_func(prompt)
  File "/opt/conda/lib/python3.10/site-packages/click/termui.py", line 147, in prompt_func
    raise Abort() from None
click.exceptions.Abort
2024-06-26 07:16:28,968 ERROR   MainThread:34 [wandb_setup.py:_flush():78] error in wandb.init()
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3553, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/tmp/ipykernel_34/4032920361.py", line 1, in <module>
    trainer.train()
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 1885, in train
    return inner_training_loop(
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 2147, in _inner_training_loop
    self.control = self.callback_handler.on_train_begin(args, self.state, self.control)
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer_callback.py", line 454, in on_train_begin
    return self.call_event("on_train_begin", args, state, control)
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer_callback.py", line 498, in call_event
    result = getattr(callback, event)(
  File "/opt/conda/lib/python3.10/site-packages/transformers/integrations/integration_utils.py", line 773, in on_train_begin
    self.setup(args, state, model, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/integrations/integration_utils.py", line 746, in setup
    self._wandb.init(
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 1178, in init
    wandb._sentry.reraise(e)
  File "/opt/conda/lib/python3.10/site-packages/wandb/analytics/sentry.py", line 155, in reraise
    raise exc.with_traceback(sys.exc_info()[2])
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 1163, in init
    wi.setup(kwargs)
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 300, in setup
    wandb_login._login(
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_login.py", line 334, in _login
    wlogin.prompt_api_key()
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_login.py", line 256, in prompt_api_key
    key, status = self._prompt_api_key()
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_login.py", line 236, in _prompt_api_key
    key = apikey.prompt_api_key(
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/lib/apikey.py", line 151, in prompt_api_key
    key = input_callback(api_ask).strip()
  File "/opt/conda/lib/python3.10/site-packages/click/termui.py", line 164, in prompt
    value = prompt_func(prompt)
  File "/opt/conda/lib/python3.10/site-packages/click/termui.py", line 147, in prompt_func
    raise Abort() from None
click.exceptions.Abort
2024-06-26 07:16:28,968 ERROR   MainThread:34 [wandb_setup.py:_flush():78] error in wandb.init()
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3553, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/tmp/ipykernel_34/4032920361.py", line 1, in <module>
    trainer.train()
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 1885, in train
    return inner_training_loop(
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 2147, in _inner_training_loop
    self.control = self.callback_handler.on_train_begin(args, self.state, self.control)
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer_callback.py", line 454, in on_train_begin
    return self.call_event("on_train_begin", args, state, control)
  File "/opt/conda/lib/python3.10/site-packages/transformers/trainer_callback.py", line 498, in call_event
    result = getattr(callback, event)(
  File "/opt/conda/lib/python3.10/site-packages/transformers/integrations/integration_utils.py", line 773, in on_train_begin
    self.setup(args, state, model, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/integrations/integration_utils.py", line 746, in setup
    self._wandb.init(
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 1178, in init
    wandb._sentry.reraise(e)
  File "/opt/conda/lib/python3.10/site-packages/wandb/analytics/sentry.py", line 155, in reraise
    raise exc.with_traceback(sys.exc_info()[2])
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 1163, in init
    wi.setup(kwargs)
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 300, in setup
    wandb_login._login(
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_login.py", line 334, in _login
    wlogin.prompt_api_key()
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_login.py", line 256, in prompt_api_key
    key, status = self._prompt_api_key()
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/wandb_login.py", line 236, in _prompt_api_key
    key = apikey.prompt_api_key(
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/lib/apikey.py", line 151, in prompt_api_key
    key = input_callback(api_ask).strip()
  File "/opt/conda/lib/python3.10/site-packages/click/termui.py", line 164, in prompt
    value = prompt_func(prompt)
  File "/opt/conda/lib/python3.10/site-packages/click/termui.py", line 147, in prompt_func
    raise Abort() from None
click.exceptions.Abort
2024-06-26 07:16:28,969 INFO    MainThread:34 [wandb_init.py:_log_setup():520] Logging user logs to /kaggle/working/wandb/run-20240626_071628-smnm2aje/logs/debug.log
2024-06-26 07:16:28,969 INFO    MainThread:34 [wandb_init.py:_log_setup():521] Logging internal logs to /kaggle/working/wandb/run-20240626_071628-smnm2aje/logs/debug-internal.log
2024-06-26 07:16:28,969 INFO    MainThread:34 [wandb_init.py:_jupyter_setup():466] configuring jupyter hooks <wandb.sdk.wandb_init._WandbInit object at 0x78720aaa5b40>
2024-06-26 07:16:28,969 INFO    MainThread:34 [wandb_init.py:init():560] calling init triggers
2024-06-26 07:16:28,969 INFO    MainThread:34 [wandb_init.py:init():567] wandb.init called with sweep_config: {}
config: {}
2024-06-26 07:16:28,969 INFO    MainThread:34 [wandb_init.py:init():610] starting backend
2024-06-26 07:16:28,969 INFO    MainThread:34 [wandb_init.py:init():614] setting up manager
2024-06-26 07:16:28,971 INFO    MainThread:34 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
2024-06-26 07:16:28,977 INFO    MainThread:34 [wandb_init.py:init():622] backend started and connected
2024-06-26 07:16:28,989 INFO    MainThread:34 [wandb_run.py:_label_probe_notebook():1328] probe notebook
2024-06-26 07:16:29,303 INFO    MainThread:34 [wandb_init.py:init():711] updated telemetry
2024-06-26 07:16:29,306 INFO    MainThread:34 [wandb_init.py:init():744] communicating run to backend with 90.0 second timeout
2024-06-26 07:16:29,429 INFO    MainThread:34 [wandb_run.py:_on_init():2396] communicating current version
2024-06-26 07:16:29,507 INFO    MainThread:34 [wandb_run.py:_on_init():2405] got version response upgrade_message: "wandb version 0.17.3 is available!  To upgrade, please run:\n $ pip install wandb --upgrade"

2024-06-26 07:16:29,507 INFO    MainThread:34 [wandb_init.py:init():795] starting run threads in backend
2024-06-26 07:16:45,662 INFO    MainThread:34 [wandb_run.py:_console_start():2374] atexit reg
2024-06-26 07:16:45,662 INFO    MainThread:34 [wandb_run.py:_redirect():2229] redirect: wrap_raw
2024-06-26 07:16:45,662 INFO    MainThread:34 [wandb_run.py:_redirect():2294] Wrapping output streams.
2024-06-26 07:16:45,663 INFO    MainThread:34 [wandb_run.py:_redirect():2319] Redirects installed.
2024-06-26 07:16:45,672 INFO    MainThread:34 [wandb_init.py:init():838] run started, returning control to user process
2024-06-26 07:16:45,678 INFO    MainThread:34 [wandb_run.py:_config_callback():1376] config_cb None None {'vocab_size': 32000, 'max_position_embeddings': 32768, 'hidden_size': 4096, 'intermediate_size': 14336, 'num_hidden_layers': 32, 'num_attention_heads': 32, 'sliding_window': None, 'num_key_value_heads': 8, 'hidden_act': 'silu', 'initializer_range': 0.02, 'rms_norm_eps': 1e-05, 'use_cache': False, 'rope_theta': 1000000.0, 'attention_dropout': 0.0, 'return_dict': True, 'output_hidden_states': False, 'output_attentions': False, 'torchscript': False, 'torch_dtype': 'bfloat16', 'use_bfloat16': False, 'tf_legacy_loss': False, 'pruned_heads': {}, 'tie_word_embeddings': False, 'chunk_size_feed_forward': 0, 'is_encoder_decoder': False, 'is_decoder': False, 'cross_attention_hidden_size': None, 'add_cross_attention': False, 'tie_encoder_decoder': False, 'max_length': 20, 'min_length': 0, 'do_sample': False, 'early_stopping': False, 'num_beams': 1, 'num_beam_groups': 1, 'diversity_penalty': 0.0, 'temperature': 1.0, 'top_k': 50, 'top_p': 1.0, 'typical_p': 1.0, 'repetition_penalty': 1.0, 'length_penalty': 1.0, 'no_repeat_ngram_size': 0, 'encoder_no_repeat_ngram_size': 0, 'bad_words_ids': None, 'num_return_sequences': 1, 'output_scores': False, 'return_dict_in_generate': False, 'forced_bos_token_id': None, 'forced_eos_token_id': None, 'remove_invalid_values': False, 'exponential_decay_length_penalty': None, 'suppress_tokens': None, 'begin_suppress_tokens': None, 'architectures': ['MistralForCausalLM'], 'finetuning_task': None, 'id2label': {0: 'LABEL_0', 1: 'LABEL_1'}, 'label2id': {'LABEL_0': 0, 'LABEL_1': 1}, 'tokenizer_class': None, 'prefix': None, 'bos_token_id': 1, 'pad_token_id': 0, 'eos_token_id': 2, 'sep_token_id': None, 'decoder_start_token_id': None, 'task_specific_params': None, 'problem_type': None, '_name_or_path': 'TheBloke/Mistral-7B-Instruct-v0.2-GPTQ', 'transformers_version': '4.41.2', 'model_type': 'mistral', 'pretraining_tp': 1, 'quantization_config': {'quant_method': 'QuantizationMethod.GPTQ', 'bits': 4, 'tokenizer': None, 'dataset': None, 'group_size': 128, 'damp_percent': 0.1, 'desc_act': True, 'sym': True, 'true_sequential': True, 'use_cuda_fp16': False, 'model_seqlen': None, 'block_name_to_quantize': None, 'module_name_preceding_first_block': None, 'batch_size': 1, 'pad_token_id': None, 'use_exllama': True, 'max_input_length': None, 'exllama_config': {'version': 'ExllamaVersion.ONE'}, 'cache_block_outputs': True, 'modules_in_block_to_quantize': None}, 'output_dir': '/kaggle/working/', 'overwrite_output_dir': False, 'do_train': False, 'do_eval': True, 'do_predict': False, 'eval_strategy': 'epoch', 'prediction_loss_only': False, 'per_device_train_batch_size': 4, 'per_device_eval_batch_size': 4, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 4, 'eval_accumulation_steps': None, 'eval_delay': 0, 'learning_rate': 0.0002, 'weight_decay': 0.01, 'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 10, 'max_steps': -1, 'lr_scheduler_type': 'linear', 'lr_scheduler_kwargs': {}, 'warmup_ratio': 0.0, 'warmup_steps': 2, 'log_level': 'passive', 'log_level_replica': 'warning', 'log_on_each_node': True, 'logging_dir': '/kaggle/working/runs/Jun26_07-16-15_adc95cf38b20', 'logging_strategy': 'epoch', 'logging_first_step': False, 'logging_steps': 500, 'logging_nan_inf_filter': True, 'save_strategy': 'epoch', 'save_steps': 500, 'save_total_limit': None, 'save_safetensors': True, 'save_on_each_node': False, 'save_only_model': False, 'restore_callback_states_from_checkpoint': False, 'no_cuda': False, 'use_cpu': False, 'use_mps_device': False, 'seed': 42, 'data_seed': None, 'jit_mode_eval': False, 'use_ipex': False, 'bf16': False, 'fp16': True, 'fp16_opt_level': 'O1', 'half_precision_backend': 'auto', 'bf16_full_eval': False, 'fp16_full_eval': False, 'tf32': None, 'local_rank': 0, 'ddp_backend': None, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': None, 'dataloader_num_workers': 0, 'dataloader_prefetch_factor': None, 'past_index': -1, 'run_name': '/kaggle/working/', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': True, 'metric_for_best_model': 'loss', 'greater_is_better': False, 'ignore_data_skip': False, 'fsdp': [], 'fsdp_min_num_params': 0, 'fsdp_config': {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, 'fsdp_transformer_layer_cls_to_wrap': None, 'accelerator_config': {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}, 'deepspeed': None, 'label_smoothing_factor': 0.0, 'optim': 'paged_adamw_8bit', 'optim_args': None, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'ddp_bucket_cap_mb': None, 'ddp_broadcast_buffers': None, 'dataloader_pin_memory': True, 'dataloader_persistent_workers': False, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': False, 'resume_from_checkpoint': None, 'hub_model_id': None, 'hub_strategy': 'every_save', 'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'eval_do_concat_batches': True, 'fp16_backend': 'auto', 'evaluation_strategy': None, 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>', 'mp_parameters': '', 'auto_find_batch_size': False, 'full_determinism': False, 'torchdynamo': None, 'ray_scope': 'last', 'ddp_timeout': 1800, 'torch_compile': False, 'torch_compile_backend': None, 'torch_compile_mode': None, 'dispatch_batches': None, 'split_batches': None, 'include_tokens_per_second': False, 'include_num_input_tokens_seen': False, 'neftune_noise_alpha': None, 'optim_target_modules': None, 'batch_eval_metrics': False}
2024-06-26 07:18:26,851 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:18:26,851 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:18:50,892 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:18:50,915 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:18:50,915 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:18:51,909 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:18:51,913 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:18:51,913 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:18:52,853 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:18:52,855 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:18:52,855 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:18:53,667 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:18:53,670 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:18:53,670 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:18:54,691 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:18:54,756 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:18:54,756 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:18:55,617 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:18:55,682 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:18:55,682 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:18:56,512 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:18:56,513 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:18:56,513 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:18:57,999 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:18:58,001 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:18:58,001 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:18:58,997 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:18:59,009 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:18:59,009 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:19:00,090 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:19:00,092 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:19:00,092 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:19:01,722 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:19:01,723 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:19:01,723 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:19:03,098 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:19:03,117 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:19:03,117 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:19:04,117 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:19:04,212 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:19:04,212 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:19:05,065 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:19:05,097 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:19:05,097 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:19:06,209 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:19:06,220 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 07:19:06,220 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 07:19:07,300 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 07:19:07,961 INFO    MainThread:34 [wandb_run.py:_config_callback():1376] config_cb None None {'vocab_size': 32000, 'max_position_embeddings': 32768, 'hidden_size': 4096, 'intermediate_size': 14336, 'num_hidden_layers': 32, 'num_attention_heads': 32, 'sliding_window': None, 'num_key_value_heads': 8, 'hidden_act': 'silu', 'initializer_range': 0.02, 'rms_norm_eps': 1e-05, 'use_cache': False, 'rope_theta': 1000000.0, 'attention_dropout': 0.0, 'return_dict': True, 'output_hidden_states': False, 'output_attentions': False, 'torchscript': False, 'torch_dtype': 'bfloat16', 'use_bfloat16': False, 'tf_legacy_loss': False, 'pruned_heads': {}, 'tie_word_embeddings': False, 'chunk_size_feed_forward': 0, 'is_encoder_decoder': False, 'is_decoder': False, 'cross_attention_hidden_size': None, 'add_cross_attention': False, 'tie_encoder_decoder': False, 'max_length': 20, 'min_length': 0, 'do_sample': False, 'early_stopping': False, 'num_beams': 1, 'num_beam_groups': 1, 'diversity_penalty': 0.0, 'temperature': 1.0, 'top_k': 50, 'top_p': 1.0, 'typical_p': 1.0, 'repetition_penalty': 1.0, 'length_penalty': 1.0, 'no_repeat_ngram_size': 0, 'encoder_no_repeat_ngram_size': 0, 'bad_words_ids': None, 'num_return_sequences': 1, 'output_scores': False, 'return_dict_in_generate': False, 'forced_bos_token_id': None, 'forced_eos_token_id': None, 'remove_invalid_values': False, 'exponential_decay_length_penalty': None, 'suppress_tokens': None, 'begin_suppress_tokens': None, 'architectures': ['MistralForCausalLM'], 'finetuning_task': None, 'id2label': {0: 'LABEL_0', 1: 'LABEL_1'}, 'label2id': {'LABEL_0': 0, 'LABEL_1': 1}, 'tokenizer_class': None, 'prefix': None, 'bos_token_id': 1, 'pad_token_id': 0, 'eos_token_id': 2, 'sep_token_id': None, 'decoder_start_token_id': None, 'task_specific_params': None, 'problem_type': None, '_name_or_path': 'TheBloke/Mistral-7B-Instruct-v0.2-GPTQ', 'transformers_version': '4.41.2', 'model_type': 'mistral', 'pretraining_tp': 1, 'quantization_config': {'quant_method': 'QuantizationMethod.GPTQ', 'bits': 4, 'tokenizer': None, 'dataset': None, 'group_size': 128, 'damp_percent': 0.1, 'desc_act': True, 'sym': True, 'true_sequential': True, 'use_cuda_fp16': False, 'model_seqlen': None, 'block_name_to_quantize': None, 'module_name_preceding_first_block': None, 'batch_size': 1, 'pad_token_id': None, 'use_exllama': True, 'max_input_length': None, 'exllama_config': {'version': 'ExllamaVersion.ONE'}, 'cache_block_outputs': True, 'modules_in_block_to_quantize': None}, 'output_dir': '/kaggle/working/', 'overwrite_output_dir': False, 'do_train': False, 'do_eval': True, 'do_predict': False, 'eval_strategy': 'epoch', 'prediction_loss_only': False, 'per_device_train_batch_size': 4, 'per_device_eval_batch_size': 4, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 4, 'eval_accumulation_steps': None, 'eval_delay': 0, 'learning_rate': 0.0002, 'weight_decay': 0.01, 'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 10, 'max_steps': -1, 'lr_scheduler_type': 'linear', 'lr_scheduler_kwargs': {}, 'warmup_ratio': 0.0, 'warmup_steps': 2, 'log_level': 'passive', 'log_level_replica': 'warning', 'log_on_each_node': True, 'logging_dir': '/kaggle/working/runs/Jun26_07-19-05_adc95cf38b20', 'logging_strategy': 'epoch', 'logging_first_step': False, 'logging_steps': 500, 'logging_nan_inf_filter': True, 'save_strategy': 'epoch', 'save_steps': 500, 'save_total_limit': None, 'save_safetensors': True, 'save_on_each_node': False, 'save_only_model': False, 'restore_callback_states_from_checkpoint': False, 'no_cuda': False, 'use_cpu': False, 'use_mps_device': False, 'seed': 42, 'data_seed': None, 'jit_mode_eval': False, 'use_ipex': False, 'bf16': False, 'fp16': True, 'fp16_opt_level': 'O1', 'half_precision_backend': 'auto', 'bf16_full_eval': False, 'fp16_full_eval': False, 'tf32': None, 'local_rank': 0, 'ddp_backend': None, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': None, 'dataloader_num_workers': 0, 'dataloader_prefetch_factor': None, 'past_index': -1, 'run_name': '/kaggle/working/', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': True, 'metric_for_best_model': 'loss', 'greater_is_better': False, 'ignore_data_skip': False, 'fsdp': [], 'fsdp_min_num_params': 0, 'fsdp_config': {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, 'fsdp_transformer_layer_cls_to_wrap': None, 'accelerator_config': {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}, 'deepspeed': None, 'label_smoothing_factor': 0.0, 'optim': 'paged_adamw_8bit', 'optim_args': None, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'ddp_bucket_cap_mb': None, 'ddp_broadcast_buffers': None, 'dataloader_pin_memory': True, 'dataloader_persistent_workers': False, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': False, 'resume_from_checkpoint': None, 'hub_model_id': None, 'hub_strategy': 'every_save', 'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'eval_do_concat_batches': True, 'fp16_backend': 'auto', 'evaluation_strategy': None, 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>', 'mp_parameters': '', 'auto_find_batch_size': False, 'full_determinism': False, 'torchdynamo': None, 'ray_scope': 'last', 'ddp_timeout': 1800, 'torch_compile': False, 'torch_compile_backend': None, 'torch_compile_mode': None, 'dispatch_batches': None, 'split_batches': None, 'include_tokens_per_second': False, 'include_num_input_tokens_seen': False, 'neftune_noise_alpha': None, 'optim_target_modules': None, 'batch_eval_metrics': False}
2024-06-26 08:35:48,386 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 08:35:48,387 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 08:36:11,677 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 08:38:19,575 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 08:38:19,575 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 08:39:55,100 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 08:39:55,103 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 08:39:55,103 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 08:40:32,017 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 08:43:02,593 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 08:43:02,594 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 08:45:10,959 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 08:45:10,963 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 08:45:10,963 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 08:45:13,384 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 08:47:43,994 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 08:47:43,994 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 08:49:58,760 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 08:53:33,771 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 08:53:33,771 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 08:54:32,055 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 08:54:32,059 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 08:54:32,059 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 08:54:32,754 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 08:58:07,756 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 08:58:07,756 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 09:12:46,632 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 09:12:46,669 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 09:12:46,669 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 09:16:22,037 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 09:16:22,038 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 09:16:22,038 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 09:16:35,228 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 09:16:36,438 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 09:16:36,438 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 09:18:25,016 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 09:18:26,213 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 09:18:26,214 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 09:21:21,819 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 09:21:21,842 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 09:21:21,842 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 09:21:36,632 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend
2024-06-26 09:21:36,633 INFO    MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2024-06-26 09:21:36,634 INFO    MainThread:34 [wandb_init.py:_pause_backend():431] pausing backend
2024-06-26 09:21:37,142 INFO    MainThread:34 [wandb_init.py:_resume_backend():436] resuming backend