2025-01-04 11:25:30,962 INFO    MainThread:928377 [wandb_setup.py:_flush():79] Current SDK version is 0.18.3
2025-01-04 11:25:30,963 INFO    MainThread:928377 [wandb_setup.py:_flush():79] Configure stats pid to 928377
2025-01-04 11:25:30,963 INFO    MainThread:928377 [wandb_setup.py:_flush():79] Loading settings from /home/align-anything/.config/wandb/settings
2025-01-04 11:25:30,963 INFO    MainThread:928377 [wandb_setup.py:_flush():79] Loading settings from /data/align-anything/hantao/align-anything/scripts/wandb/settings
2025-01-04 11:25:30,963 INFO    MainThread:928377 [wandb_setup.py:_flush():79] Loading settings from environment variables: {'api_key': '***REDACTED***', 'mode': 'online'}
2025-01-04 11:25:30,963 INFO    MainThread:928377 [wandb_setup.py:_flush():79] Applying setup settings: {'mode': 'online', '_disable_service': None}
2025-01-04 11:25:30,963 WARNING MainThread:928377 [wandb_setup.py:_flush():79] Could not find program at -m align_anything.trainers.text_image_to_text_image.dpo
2025-01-04 11:25:30,963 INFO    MainThread:928377 [wandb_setup.py:_flush():79] Inferring run settings from compute environment: {'program_relpath': None, 'program': '-m align_anything.trainers.text_image_to_text_image.dpo'}
2025-01-04 11:25:30,963 INFO    MainThread:928377 [wandb_setup.py:_flush():79] Applying login settings: {}
2025-01-04 11:25:30,963 INFO    MainThread:928377 [wandb_init.py:_log_setup():532] Logging user logs to /data/align-anything/hantao/align-anything/outputs/mm_interp/AA_preference_l0_new_step10/q0_10_preference/wandb/run-20250104_112530-4akkv1ur/logs/debug.log
2025-01-04 11:25:30,963 INFO    MainThread:928377 [wandb_init.py:_log_setup():533] Logging internal logs to /data/align-anything/hantao/align-anything/outputs/mm_interp/AA_preference_l0_new_step10/q0_10_preference/wandb/run-20250104_112530-4akkv1ur/logs/debug-internal.log
2025-01-04 11:25:30,963 INFO    MainThread:928377 [wandb_init.py:init():617] calling init triggers
2025-01-04 11:25:30,963 INFO    MainThread:928377 [wandb_init.py:init():624] wandb.init called with sweep_config: {}
config: {'train_cfgs': {'ds_cfgs': 'ds_z3_config.json', 'epochs': 3.0, 'seed': 42, 'per_device_train_batch_size': 4.0, 'per_device_eval_batch_size': 4.0, 'gradient_accumulation_steps': 2.0, 'gradient_checkpointing': True, 'learning_rate': 1e-06, 'lr_scheduler_type': 'cosine', 'lr_warmup_ratio': 0.03, 'weight_decay': 0.01, 'adam_betas': [0.9, 0.95], 'bf16': True, 'fp16': False, 'eval_strategy': 'epoch', 'eval_interval': 10, 'regularization': 0.001, 'scale_coeff': 0.1, 'freeze_mm_proj': True, 'freeze_vision_tower': False, 'freeze_language_model': True}, 'data_cfgs': {'train_datasets': '/data/align-anything/hantao/data/mm_interp/AA_preference_l0_new_step10/tokenized', 'train_template': 'Chameleon_preference', 'train_size': None, 'train_split': 'train', 'train_subset': None, 'train_data_files': 'q0_10_preference.pt', 'train_optional_args': [], 'eval_datasets': None, 'eval_template': None, 'eval_size': None, 'eval_split': None, 'eval_subset': None, 'eval_data_files': None, 'eval_optional_args': []}, 'logger_cfgs': {'log_type': 'wandb', 'log_project': 'align-anything', 'log_run_name': 'dpo', 'output_dir': '/data/align-anything/hantao/align-anything/outputs/mm_interp/AA_preference_l0_new_step10/q0_10_preference', 'cache_dir': None, 'save_interval': 400.0}, 'model_cfgs': {'model_name_or_path': '/data/align-anything/hantao/models/chameleon-7b', 'trust_remote_code': True, 'model_max_length': 4096}, 'special_tokens': None}
2025-01-04 11:25:30,964 INFO    MainThread:928377 [wandb_init.py:init():667] starting backend
2025-01-04 11:25:30,964 INFO    MainThread:928377 [wandb_init.py:init():671] sending inform_init request
2025-01-04 11:25:30,970 INFO    MainThread:928377 [backend.py:_multiprocessing_setup():104] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
2025-01-04 11:25:30,970 INFO    MainThread:928377 [wandb_init.py:init():684] backend started and connected
2025-01-04 11:25:30,973 INFO    MainThread:928377 [wandb_init.py:init():779] updated telemetry
2025-01-04 11:25:31,028 INFO    MainThread:928377 [wandb_init.py:init():812] communicating run to backend with 90.0 second timeout
2025-01-04 11:25:31,681 INFO    MainThread:928377 [wandb_init.py:init():863] starting run threads in backend
2025-01-04 11:25:32,163 INFO    MainThread:928377 [wandb_run.py:_console_start():2465] atexit reg
2025-01-04 11:25:32,163 INFO    MainThread:928377 [wandb_run.py:_redirect():2313] redirect: wrap_raw
2025-01-04 11:25:32,163 INFO    MainThread:928377 [wandb_run.py:_redirect():2378] Wrapping output streams.
2025-01-04 11:25:32,163 INFO    MainThread:928377 [wandb_run.py:_redirect():2403] Redirects installed.
2025-01-04 11:25:32,172 INFO    MainThread:928377 [wandb_init.py:init():907] run started, returning control to user process
2025-01-04 12:20:28,497 INFO    MainThread:928377 [wandb_run.py:_finish():2164] finishing run htlou/align-anything/4akkv1ur
2025-01-04 12:20:28,499 INFO    MainThread:928377 [wandb_run.py:_atexit_cleanup():2428] got exitcode: 0
2025-01-04 12:20:28,500 INFO    MainThread:928377 [wandb_run.py:_restore():2410] restore
2025-01-04 12:20:28,500 INFO    MainThread:928377 [wandb_run.py:_restore():2416] restore done
2025-01-04 12:20:33,109 INFO    MainThread:928377 [wandb_run.py:_footer_history_summary_info():4049] rendering history
2025-01-04 12:20:33,111 INFO    MainThread:928377 [wandb_run.py:_footer_history_summary_info():4081] rendering summary
2025-01-04 12:20:33,122 INFO    MainThread:928377 [wandb_run.py:_footer_sync_info():4008] logging synced files