W0217 19:20:22.279000 35449 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.279000 35449 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 35449 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.279000 35449 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 44026 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.279000 44026 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 44026 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.279000 44026 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 1829281 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.279000 1829281 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 1829281 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.279000 1829281 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 33190 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.279000 33190 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 33190 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.279000 33190 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 40069 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.279000 40069 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 40069 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.279000 40069 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 2501994 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.279000 2501994 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 2501994 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.279000 2501994 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 37851 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.279000 37851 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 37851 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.279000 37851 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 31269 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.279000 31269 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 31269 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.279000 31269 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 45067 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.279000 45067 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 45067 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.279000 45067 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 42089 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.279000 42089 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 42089 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.279000 42089 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 41382 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.279000 41382 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 41382 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.279000 41382 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 45670 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.279000 45670 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.279000 45670 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.279000 45670 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.281000 239443 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.281000 239443 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.281000 239443 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.281000 239443 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.287000 24658 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.287000 24658 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.287000 24658 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.287000 24658 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.287000 1187194 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:22.287000 1187194 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:22.287000 1187194 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:22.287000 1187194 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:24.587000 2401523 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] 
W0217 19:20:24.587000 2401523 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0217 19:20:24.587000 2401523 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0217 19:20:24.587000 2401523 .local/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
PyTorch: setting up devices
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
loading configuration file config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/config.json
You are using a model of type qwen2_5_vl to instantiate a model of type llava_qwen. This is not supported for all configurations of models and can yield errors.
Model config LlavaQwenConfig {
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 128000,
  "max_window_layers": 28,
  "model_type": "llava_qwen",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "spatial_patch_size": 14,
    "tokens_per_second": 2
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 152064
}

loading weights file model.safetensors from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/model.safetensors.index.json
Instantiating LlavaQwenForCausalLM model under default dtype torch.bfloat16.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Generate config GenerationConfig {
  "bos_token_id": 151643,
  "eos_token_id": 151645
}

Instantiating Qwen2_5_VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.17it/s]
3/5 [00:01<00:00,  2.80it/s]Loading checkpoint shards:  60%|██████    | 3/5 [00:01<00:00,  2.80it/s]Loading checkpoint shards:  60%|██████    | 3/5 [00:01<00:00,  2.78it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.09it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.06it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.13it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.04it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.06it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.07it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00  2.29it/s]Loading checkpoint shards:  60%|██████    | 3/5 [00:01<00:00,  3.08it/s]Loading checkpoint shards:  60%|██████    | 3/5 [00:01<00:00,  3.07it/s]Loading checkpoint shards:  60%|██████    | 3/5 [00:01<00:00,  2.98it/s]Loading checkpoint shards:  60%|██████    | 3/5 [00:01<00:00,  3.01it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.18it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  2.63it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.18it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.13it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.20it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.17it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.08it/s]Loading checkpoint shards:  80%|█████�  2.90it/s]Loading checkpoint shards:  60%|██████    | 3/5 [00:01<00:00,  2.97it/s]Loading checkpoint shards:  60%|██████    | 3/5 [00:01<00:00,  2.88it/s]Loading checkpoint shards:  60%|██████    | 3/5 [00:01<00:00,  2.23it/s]Loading checkpoint shards:  60%|██████    | 3/5 [00:01<00:00,  2.76it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.17it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.14it/s]Loading checkpoint shards: 
 80%|████████  | 4/5 [00:01<00:00,  3.08it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.14it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  2.58it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.12it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.06it/s]Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  3.14it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.73it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.26it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.90it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.59it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.98it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.70it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.86it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.51it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.85it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.49it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.61it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.06it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.29it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  2.60it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.77it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.27it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.89it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.59it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.73it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.30it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.84it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.48it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.86it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.48it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.92it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.66it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.79it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.40it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.80it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.44it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.76it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.34it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.92it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.67it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.96it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.67it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.73it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.28it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.67it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.18it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.81it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.42it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.87it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.59it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.32it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  2.65it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.70it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.25it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.76it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.35it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.87it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.56it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.79it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.41it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.38it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  2.71it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.70it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.27it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.36it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  2.62it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.64it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.15it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.70it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.25it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.85it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.57it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.81it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.42it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.80it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.30it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.48it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  2.89it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.69it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.17it/s]
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.71it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.25it/s]
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.70it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.28it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.86it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.54it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.70it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.28it/s]
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.73it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.29it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.58it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.04it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.71it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.26it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  3.20it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  2.50it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.75it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.26it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.65it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.20it/s]
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.68it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.28it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.75it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.28it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.75it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.43it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  3.24it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  2.49it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.67it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.19it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.66it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.28it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.79it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.53it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.98it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.80it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.69it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.36it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.58it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.11it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.82it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.59it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.62it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.27it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.66it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.33it/s]
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.55it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.17it/s]
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.63it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.30it/s]
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.51it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.08it/s]
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.76it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.53it/s]
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.55it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.15it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.54it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.15it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.58it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.23it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.71it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.47it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.55it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.14it/s]
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.69it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.47it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.64it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.36it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.60it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.28it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.75it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.56it/s]
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.

All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.56it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.17it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.57it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.25it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.68it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.45it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.60it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.26it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.54it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.17it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.69it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.49it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.49it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.11it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  2.99it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  2.32it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.59it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.29it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.64it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.39it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.61it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.33it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.74it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.58it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.72it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.51it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Loading checkpoint shards:  80%|████████  | 4/5 [00:01<00:00,  2.96it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.49it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.56it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.06it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.21it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.67it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.46it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.48it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.10it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.48it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.10it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.66it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.41it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.74it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.55it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.50it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.19it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.34it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  2.89it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.50it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.13it/s]
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.56it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.32it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.45it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.08it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.45it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.19it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.47it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.11it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.43it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.03it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.45it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.06it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  3.02it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  2.37it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.82it/s]Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.69it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.58it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.50it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.35it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.19it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.49it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.12it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.48it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.11it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.54it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.25it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.49it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.15it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.41it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.04it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.53it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.27it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.41it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.01it/s]
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  3.02it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  2.39it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.52it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.20it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  2.98it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  2.31it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.43it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.09it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.41it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.03it/s]
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  3.07it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:02<00:00,  2.48it/s]
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.43it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.09it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.38it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.00it/s]
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.62it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.51it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.58it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.42it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.43it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.15it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.37it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  2.98it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.32it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.06it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.47it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.22it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.39it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:01<00:00,  3.14it/s]
All model checkpoint weights were used when initializing LlavaQwenForCausalLM.

All the weights of LlavaQwenForCausalLM were initialized from the model checkpoint at Qwen/Qwen2.5-VL-7B-Instruct.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlavaQwenForCausalLM for predictions without further training.
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file special_tokens_map.json from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file chat_template.jinja from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file added_tokens.json from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file chat_template.jinja from cache at None
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file chat_template.jinja from cache at None
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file added_tokens.json from cache at None
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file special_tokens_map.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file added_tokens.json from cache at None
loading file chat_template.jinja from cache at None
loading file chat_template.jinja from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file chat_template.jinja from cache at None
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file chat_template.jinja from cache at None
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file added_tokens.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file special_tokens_map.json from cache at None
loading file chat_template.jinja from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file chat_template.jinja from cache at None
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file chat_template.jinja from cache at None
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading configuration file generation_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/generation_config.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Generate config GenerationConfig {
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_k": 1,
  "top_p": 0.001
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file chat_template.jinja from cache at None
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file added_tokens.json from cache at None
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file special_tokens_map.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file added_tokens.json from cache at None
loading file chat_template.jinja from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file special_tokens_map.json from cache at None
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file chat_template.jinja from cache at None
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file chat_template.jinja from cache at None
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file special_tokens_map.json from cache at None
loading file added_tokens.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file special_tokens_map.json from cache at None
loading file chat_template.jinja from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file added_tokens.json from cache at None
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file added_tokens.json from cache at None
loading file chat_template.jinja from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file chat_template.jinja from cache at None
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Image processor Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file added_tokens.json from cache at None
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file special_tokens_map.json from cache at None
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file chat_template.jinja from cache at None
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
loading configuration file preprocessor_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/preprocessor_config.json
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}
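Note on the image processor settings above: `min_pixels` and `max_pixels` are both exact multiples of (`patch_size` * `merge_size`)^2 = 28^2. The sketch below works through that arithmetic. It assumes that each merged 28x28 pixel cell corresponds to one visual token, which is an assumption about the model and is not stated in this log; the function name `merged_cells` is hypothetical.

```python
import math

# Values copied from the image_processor dump above.
patch_size = 14
merge_size = 2
min_pixels = 3136
max_pixels = 12845056
rescale_factor = 0.00392156862745098

# Side length in pixels of one merged cell (assumed: one visual token per cell).
cell_side = patch_size * merge_size
cell_area = cell_side * cell_side


def merged_cells(pixels: int) -> int:
    """Number of whole merged cells that fit in a pixel budget."""
    return pixels // cell_area


print("cell side:", cell_side)
print("min cells:", merged_cells(min_pixels))
print("max cells:", merged_cells(max_pixels))

# Both pixel limits divide evenly by the cell area.
print("limits aligned:", min_pixels % cell_area == 0 and max_pixels % cell_area == 0)

# rescale_factor matches 1/255, i.e. mapping 8-bit pixel values into [0, 1].
print("rescale is 1/255:", math.isclose(rescale_factor, 1 / 255))
```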

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
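The warning above is a matter of simple arithmetic: the resized embedding has 151668 rows, which is not a multiple of 8, so the size would need rounding up before `pad_to_multiple_of` has any effect. A minimal sketch of that rounding follows. The helper name `padded_vocab_size` is hypothetical, and the candidate multiples (8, 64, 128) are illustrative assumptions, not values taken from this run.

```python
def padded_vocab_size(vocab_size: int, multiple: int) -> int:
    """Round vocab_size up to the nearest multiple of `multiple`."""
    if multiple <= 0:
        raise ValueError("multiple must be a positive integer")
    return ((vocab_size + multiple - 1) // multiple) * multiple


# Size reported by the warning in this log.
new_size = 151668

# Illustrative candidates only; the right value depends on dtype and hardware.
for multiple in (8, 64, 128):
    padded = padded_vocab_size(new_size, multiple)
    print(f"multiple={multiple:>3}  padded={padded}  extra_rows={padded - new_size}")

# multiple=  8  padded=151672  extra_rows=4
# multiple= 64  padded=151680  extra_rows=12
# multiple=128  padded=151680  extra_rows=12
```

With Hugging Face Transformers, the same rounding is presumably what happens when the parameter named in the warning is passed, e.g. `model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=64)`. Whether the extra rows matter for this training script is not something the log shows.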
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
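Note on the warning above: the resized embedding has 151668 rows, and the warning suggests padding that count to a multiple that Tensor Core kernels can use. The rounding involved can be sketched as follows. This is a minimal illustration, assuming a multiple of 64 (the log does not say which multiple is appropriate), and `padded_vocab_size` is a hypothetical helper written for this note, not a library function.

```python
def padded_vocab_size(num_tokens: int, multiple: int) -> int:
    """Round num_tokens up to the nearest multiple of `multiple`."""
    if num_tokens <= 0 or multiple <= 0:
        raise ValueError("num_tokens and multiple must be positive")
    return ((num_tokens + multiple - 1) // multiple) * multiple


# 151668 is the embedding size reported in the warning.
# 64 is an assumed multiple, chosen only for illustration.
new_size = padded_vocab_size(151668, 64)
print(new_size, new_size - 151668)  # padded size and number of extra rows
```

In Hugging Face Transformers the same rounding can be requested directly by passing `pad_to_multiple_of` to `model.resize_token_embeddings(...)`, which is the parameter the warning refers to. The extra rows are padding and correspond to no tokenizer entry.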
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}
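
Note on the image_processor values above: a minimal sketch, assuming each visual token covers a (patch_size * merge_size) x (patch_size * merge_size) pixel area, which is how min_pixels and max_pixels are usually read for this processor family. The helper names are hypothetical; only the four constants come from the dump.

```python
# Constants copied from the image_processor dump in this log.
PATCH_SIZE = 14
MERGE_SIZE = 2
MIN_PIXELS = 3136
MAX_PIXELS = 12845056


def pixels_per_token(patch_size: int, merge_size: int) -> int:
    """Pixel area covered by one merged visual token (assumed layout)."""
    side = patch_size * merge_size
    return side * side


def token_budget(min_pixels: int, max_pixels: int, patch_size: int, merge_size: int) -> tuple[int, int]:
    """Approximate (min, max) visual tokens per image implied by the pixel bounds."""
    per_token = pixels_per_token(patch_size, merge_size)
    return min_pixels // per_token, max_pixels // per_token


if __name__ == "__main__":
    low, high = token_budget(MIN_PIXELS, MAX_PIXELS, PATCH_SIZE, MERGE_SIZE)
    print(f"pixels per token: {pixels_per_token(PATCH_SIZE, MERGE_SIZE)}")
    print(f"visual tokens per image: between {low} and {high}")
```

Under that assumption the upper bound is large, so lowering max_pixels is one way to reduce per-image sequence length if memory becomes a problem.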

You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
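
Note on the warning above: a minimal sketch of how a padded embedding size could be chosen. The helper name and the candidate multiples are assumptions for illustration; 151668 is the size reported in the warning. In transformers, the padded size would typically be requested with model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=64), but which multiple suits this hardware is not stated in the log.

```python
def round_up_to_multiple(size: int, multiple: int) -> int:
    """Return the smallest multiple of `multiple` that is >= size."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    # Ceiling division via negated floor division, then scale back up.
    return -(-size // multiple) * multiple


if __name__ == "__main__":
    new_vocab_size = 151668  # value reported in the warning
    for multiple in (8, 64, 128):  # candidate multiples, chosen for illustration
        padded = round_up_to_multiple(new_vocab_size, multiple)
        extra = padded - new_vocab_size
        print(f"pad_to_multiple_of={multiple}: {padded} rows ({extra} unused)")
```

The extra rows are never produced by the tokenizer, so the padding costs only a small amount of memory.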
Processor Qwen2_5_VLProcessor:
- image_processor: Qwen2VLImageProcessor {
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 12845056,
  "merge_size": 2,
  "min_pixels": 3136,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 12845056,
    "shortest_edge": 3136
  },
  "temporal_patch_size": 2
}

- tokenizer: Qwen2TokenizerFast(name_or_path='Qwen/Qwen2.5-VL-7B-Instruct', vocab_size=151643, model_max_length=131072, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
	151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	151657: AddedToken("<tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151658: AddedToken("</tool_call>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151659: AddedToken("<|fim_prefix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151660: AddedToken("<|fim_middle|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151661: AddedToken("<|fim_suffix|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151662: AddedToken("<|fim_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151663: AddedToken("<|repo_name|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
	151664: AddedToken("<|file_sep|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Qwen2_5_VLProcessor"
}

loading file vocab.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/vocab.json
loading file merges.txt from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/merges.txt
loading file tokenizer.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /fsx_0/user/zhaojiang/models/hub/models--Qwen--Qwen2.5-VL-7B-Instruct/snapshots/6e6556e8ce728c7b3e438d75ebf04ec93403dc19/tokenizer_config.json
loading file chat_template.jinja from cache at None
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embedding dimension will be 151668. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/model/multimodal_encoder/eva_clip/eva_vit.py:622: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=map_location)
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Using custom data configuration default-5e4e9de28fd39dca
Loading Dataset Infos from /home/zhaojiang/.local/lib/python3.10/site-packages/datasets/packaged_modules/webdataset
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
Found cached dataset webdataset (/fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f)
Loading Dataset info from /fsx_0/user/zhaojiang/wb/webdataset/default-5e4e9de28fd39dca/0.0.0/e9ef0843eead451e800ef3bd9a9ee86b731520f88aa20be2d598ddfeef5b3f7f
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank2]: Traceback (most recent call last):
[rank2]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank2]:     train(attn_implementation="flash_attention_2")
[rank2]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank2]:     trainer.train(resume_from_checkpoint=True)
[rank2]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank2]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank2]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank2]:     with open(json_path, "r", encoding="utf-8") as f:
[rank2]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
Using auto half precision backend
[rank48]: Traceback (most recent call last):
[rank48]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank48]:     train(attn_implementation="flash_attention_2")
[rank48]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank48]:     trainer.train(resume_from_checkpoint=True)
[rank48]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank48]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank48]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank48]:     with open(json_path, "r", encoding="utf-8") as f:
[rank48]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
Using auto half precision backend
[rank16]: Traceback (most recent call last):
[rank16]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank16]:     train(attn_implementation="flash_attention_2")
[rank16]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank16]:     trainer.train(resume_from_checkpoint=True)
[rank16]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank16]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank16]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank16]:     with open(json_path, "r", encoding="utf-8") as f:
[rank16]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank42]: Traceback (most recent call last):
[rank42]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank42]:     train(attn_implementation="flash_attention_2")
[rank42]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank42]:     trainer.train(resume_from_checkpoint=True)
[rank42]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank42]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank42]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank42]:     with open(json_path, "r", encoding="utf-8") as f:
[rank42]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank47]: Traceback (most recent call last):
[rank47]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank47]:     train(attn_implementation="flash_attention_2")
[rank47]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank47]:     trainer.train(resume_from_checkpoint=True)
[rank47]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank47]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank47]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank47]:     with open(json_path, "r", encoding="utf-8") as f:
[rank47]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank89]: Traceback (most recent call last):
[rank89]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank89]:     train(attn_implementation="flash_attention_2")
[rank89]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank89]:     trainer.train(resume_from_checkpoint=True)
[rank89]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank89]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank89]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank89]:     with open(json_path, "r", encoding="utf-8") as f:
[rank89]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank94]: Traceback (most recent call last):
[rank94]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank94]:     train(attn_implementation="flash_attention_2")
[rank94]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank94]:     trainer.train(resume_from_checkpoint=True)
[rank94]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank94]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank94]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank94]:     with open(json_path, "r", encoding="utf-8") as f:
[rank94]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank85]: Traceback (most recent call last):
[rank85]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank85]:     train(attn_implementation="flash_attention_2")
[rank85]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank85]:     trainer.train(resume_from_checkpoint=True)
[rank85]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank85]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank85]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank85]:     with open(json_path, "r", encoding="utf-8") as f:
[rank85]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank62]: Traceback (most recent call last):
[rank62]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank62]:     train(attn_implementation="flash_attention_2")
[rank62]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank62]:     trainer.train(resume_from_checkpoint=True)
[rank62]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank62]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank62]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank62]:     with open(json_path, "r", encoding="utf-8") as f:
[rank62]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank76]: Traceback (most recent call last):
[rank76]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank76]:     train(attn_implementation="flash_attention_2")
[rank76]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank76]:     trainer.train(resume_from_checkpoint=True)
[rank76]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank76]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank76]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank76]:     with open(json_path, "r", encoding="utf-8") as f:
[rank76]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank78]: Traceback (most recent call last):
[rank78]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank78]:     train(attn_implementation="flash_attention_2")
[rank78]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank78]:     trainer.train(resume_from_checkpoint=True)
[rank78]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank78]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank78]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank78]:     with open(json_path, "r", encoding="utf-8") as f:
[rank78]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank34]: Traceback (most recent call last):
[rank34]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank34]:     train(attn_implementation="flash_attention_2")
[rank34]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank34]:     trainer.train(resume_from_checkpoint=True)
[rank34]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank34]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank34]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank34]:     with open(json_path, "r", encoding="utf-8") as f:
[rank34]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank46]: Traceback (most recent call last):
[rank46]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank46]:     train(attn_implementation="flash_attention_2")
[rank46]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank46]:     trainer.train(resume_from_checkpoint=True)
[rank46]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank46]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank46]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank46]:     with open(json_path, "r", encoding="utf-8") as f:
[rank46]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank57]: Traceback (most recent call last):
[rank57]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank57]:     train(attn_implementation="flash_attention_2")
[rank57]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank57]:     trainer.train(resume_from_checkpoint=True)
[rank57]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank57]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank57]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank57]:     with open(json_path, "r", encoding="utf-8") as f:
[rank57]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
Using auto half precision backend
[rank88]: Traceback (most recent call last):
[rank88]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank88]:     train(attn_implementation="flash_attention_2")
[rank88]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank88]:     trainer.train(resume_from_checkpoint=True)
[rank88]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank88]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank88]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank88]:     with open(json_path, "r", encoding="utf-8") as f:
[rank88]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank105]: Traceback (most recent call last):
[rank105]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank105]:     train(attn_implementation="flash_attention_2")
[rank105]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank105]:     trainer.train(resume_from_checkpoint=True)
[rank105]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank105]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank105]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank105]:     with open(json_path, "r", encoding="utf-8") as f:
[rank105]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank25]: Traceback (most recent call last):
[rank25]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank25]:     train(attn_implementation="flash_attention_2")
[rank25]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank25]:     trainer.train(resume_from_checkpoint=True)
[rank25]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank25]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank25]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank25]:     with open(json_path, "r", encoding="utf-8") as f:
[rank25]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
Using auto half precision backend
[rank104]: Traceback (most recent call last):
[rank104]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank104]:     train(attn_implementation="flash_attention_2")
[rank104]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank104]:     trainer.train(resume_from_checkpoint=True)
[rank104]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank104]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank104]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank104]:     with open(json_path, "r", encoding="utf-8") as f:
[rank104]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank23]: Traceback (most recent call last):
[rank23]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank23]:     train(attn_implementation="flash_attention_2")
[rank23]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank23]:     trainer.train(resume_from_checkpoint=True)
[rank23]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank23]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank23]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank23]:     with open(json_path, "r", encoding="utf-8") as f:
[rank23]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank116]: Traceback (most recent call last):
[rank116]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank116]:     train(attn_implementation="flash_attention_2")
[rank116]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank116]:     trainer.train(resume_from_checkpoint=True)
[rank116]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank116]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank116]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank116]:     with open(json_path, "r", encoding="utf-8") as f:
[rank116]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank99]: Traceback (most recent call last):
[rank99]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank99]:     train(attn_implementation="flash_attention_2")
[rank99]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank99]:     trainer.train(resume_from_checkpoint=True)
[rank99]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank99]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank99]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank99]:     with open(json_path, "r", encoding="utf-8") as f:
[rank99]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank109]: Traceback (most recent call last):
[rank109]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank109]:     train(attn_implementation="flash_attention_2")
[rank109]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank109]:     trainer.train(resume_from_checkpoint=True)
[rank109]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank109]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank109]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank109]:     with open(json_path, "r", encoding="utf-8") as f:
[rank109]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
Using auto half precision backend
[rank112]: Traceback (most recent call last):
[rank112]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank112]:     train(attn_implementation="flash_attention_2")
[rank112]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank112]:     trainer.train(resume_from_checkpoint=True)
[rank112]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank112]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank112]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank112]:     with open(json_path, "r", encoding="utf-8") as f:
[rank112]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank83]: Traceback (most recent call last):
[rank83]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank83]:     train(attn_implementation="flash_attention_2")
[rank83]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank83]:     trainer.train(resume_from_checkpoint=True)
[rank83]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank83]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank83]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank83]:     with open(json_path, "r", encoding="utf-8") as f:
[rank83]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank45]: Traceback (most recent call last):
[rank45]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank45]:     train(attn_implementation="flash_attention_2")
[rank45]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank45]:     trainer.train(resume_from_checkpoint=True)
[rank45]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank45]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank45]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank45]:     with open(json_path, "r", encoding="utf-8") as f:
[rank45]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank28]: Traceback (most recent call last):
[rank28]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank28]:     train(attn_implementation="flash_attention_2")
[rank28]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank28]:     trainer.train(resume_from_checkpoint=True)
[rank28]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank28]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank28]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank28]:     with open(json_path, "r", encoding="utf-8") as f:
[rank28]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
Using auto half precision backend
[rank72]: Traceback (most recent call last):
[rank72]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank72]:     train(attn_implementation="flash_attention_2")
[rank72]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank72]:     trainer.train(resume_from_checkpoint=True)
[rank72]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank72]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank72]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank72]:     with open(json_path, "r", encoding="utf-8") as f:
[rank72]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank103]: Traceback (most recent call last):
[rank103]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank103]:     train(attn_implementation="flash_attention_2")
[rank103]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank103]:     trainer.train(resume_from_checkpoint=True)
[rank103]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank103]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank103]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank103]:     with open(json_path, "r", encoding="utf-8") as f:
[rank103]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank21]: Traceback (most recent call last):
[rank21]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank21]:     train(attn_implementation="flash_attention_2")
[rank21]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank21]:     trainer.train(resume_from_checkpoint=True)
[rank21]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank21]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank21]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank21]:     with open(json_path, "r", encoding="utf-8") as f:
[rank21]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank121]: Traceback (most recent call last):
[rank121]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank121]:     train(attn_implementation="flash_attention_2")
[rank121]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank121]:     trainer.train(resume_from_checkpoint=True)
[rank121]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank121]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank121]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank121]:     with open(json_path, "r", encoding="utf-8") as f:
[rank121]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank54]: Traceback (most recent call last):
[rank54]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank54]:     train(attn_implementation="flash_attention_2")
[rank54]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank54]:     trainer.train(resume_from_checkpoint=True)
[rank54]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank54]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank54]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank54]:     with open(json_path, "r", encoding="utf-8") as f:
[rank54]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank1]: Traceback (most recent call last):
[rank1]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank1]:     train(attn_implementation="flash_attention_2")
[rank1]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank1]:     trainer.train(resume_from_checkpoint=True)
[rank1]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank1]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank1]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank1]:     with open(json_path, "r", encoding="utf-8") as f:
[rank1]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank14]: Traceback (most recent call last):
[rank14]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank14]:     train(attn_implementation="flash_attention_2")
[rank14]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank14]:     trainer.train(resume_from_checkpoint=True)
[rank14]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank14]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank14]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank14]:     with open(json_path, "r", encoding="utf-8") as f:
[rank14]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank51]: Traceback (most recent call last):
[rank51]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank51]:     train(attn_implementation="flash_attention_2")
[rank51]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank51]:     trainer.train(resume_from_checkpoint=True)
[rank51]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank51]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank51]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank51]:     with open(json_path, "r", encoding="utf-8") as f:
[rank51]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank125]: Traceback (most recent call last):
[rank125]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank125]:     train(attn_implementation="flash_attention_2")
[rank125]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank125]:     trainer.train(resume_from_checkpoint=True)
[rank125]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank125]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank125]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank125]:     with open(json_path, "r", encoding="utf-8") as f:
[rank125]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank101]: Traceback (most recent call last):
[rank101]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank101]:     train(attn_implementation="flash_attention_2")
[rank101]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank101]:     trainer.train(resume_from_checkpoint=True)
[rank101]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank101]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank101]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank101]:     with open(json_path, "r", encoding="utf-8") as f:
[rank101]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank22]: Traceback (most recent call last):
[rank22]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank22]:     train(attn_implementation="flash_attention_2")
[rank22]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank22]:     trainer.train(resume_from_checkpoint=True)
[rank22]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank22]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank22]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank22]:     with open(json_path, "r", encoding="utf-8") as f:
[rank22]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank77]: Traceback (most recent call last):
[rank77]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank77]:     train(attn_implementation="flash_attention_2")
[rank77]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank77]:     trainer.train(resume_from_checkpoint=True)
[rank77]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank77]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank77]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank77]:     with open(json_path, "r", encoding="utf-8") as f:
[rank77]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank19]: Traceback (most recent call last):
[rank19]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank19]:     train(attn_implementation="flash_attention_2")
[rank19]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank19]:     trainer.train(resume_from_checkpoint=True)
[rank19]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank19]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank19]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank19]:     with open(json_path, "r", encoding="utf-8") as f:
[rank19]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank97]: Traceback (most recent call last):
[rank97]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank97]:     train(attn_implementation="flash_attention_2")
[rank97]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank97]:     trainer.train(resume_from_checkpoint=True)
[rank97]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank97]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank97]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank97]:     with open(json_path, "r", encoding="utf-8") as f:
[rank97]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank79]: Traceback (most recent call last):
[rank79]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank79]:     train(attn_implementation="flash_attention_2")
[rank79]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank79]:     trainer.train(resume_from_checkpoint=True)
[rank79]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank79]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank79]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank79]:     with open(json_path, "r", encoding="utf-8") as f:
[rank79]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank48]:[W217 19:39:32.167741292 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present,  but this warning has only been added since PyTorch 2.4 (function operator())
[rank70]: Traceback (most recent call last):
[rank70]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank70]:     train(attn_implementation="flash_attention_2")
[rank70]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank70]:     trainer.train(resume_from_checkpoint=True)
[rank70]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank70]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank70]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank70]:     with open(json_path, "r", encoding="utf-8") as f:
[rank70]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank60]: Traceback (most recent call last):
[rank60]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank60]:     train(attn_implementation="flash_attention_2")
[rank60]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank60]:     trainer.train(resume_from_checkpoint=True)
[rank60]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank60]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank60]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank60]:     with open(json_path, "r", encoding="utf-8") as f:
[rank60]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
Using auto half precision backend
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank56]: Traceback (most recent call last):
[rank56]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank56]:     train(attn_implementation="flash_attention_2")
[rank56]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank56]:     trainer.train(resume_from_checkpoint=True)
[rank56]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank56]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank56]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank56]:     with open(json_path, "r", encoding="utf-8") as f:
[rank56]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank108]: Traceback (most recent call last):
[rank108]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank108]:     train(attn_implementation="flash_attention_2")
[rank108]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank108]:     trainer.train(resume_from_checkpoint=True)
[rank108]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank108]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank108]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank108]:     with open(json_path, "r", encoding="utf-8") as f:
[rank108]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
Using auto half precision backend
[rank32]: Traceback (most recent call last):
[rank32]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank32]:     train(attn_implementation="flash_attention_2")
[rank32]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank32]:     trainer.train(resume_from_checkpoint=True)
[rank32]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank32]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank32]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank32]:     with open(json_path, "r", encoding="utf-8") as f:
[rank32]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank16]:[W217 19:39:33.171650602 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present,  but this warning has only been added since PyTorch 2.4 (function operator())
[rank86]: Traceback (most recent call last):
[rank86]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank86]:     train(attn_implementation="flash_attention_2")
[rank86]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank86]:     trainer.train(resume_from_checkpoint=True)
[rank86]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank86]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank86]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank86]:     with open(json_path, "r", encoding="utf-8") as f:
[rank86]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank73]: Traceback (most recent call last):
[rank73]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank73]:     train(attn_implementation="flash_attention_2")
[rank73]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank73]:     trainer.train(resume_from_checkpoint=True)
[rank73]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank73]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank73]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank73]:     with open(json_path, "r", encoding="utf-8") as f:
[rank73]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank90]: Traceback (most recent call last):
[rank90]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank90]:     train(attn_implementation="flash_attention_2")
[rank90]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank90]:     trainer.train(resume_from_checkpoint=True)
[rank90]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank90]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank90]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank90]:     with open(json_path, "r", encoding="utf-8") as f:
[rank90]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank68]: Traceback (most recent call last):
[rank68]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank68]:     train(attn_implementation="flash_attention_2")
[rank68]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank68]:     trainer.train(resume_from_checkpoint=True)
[rank68]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank68]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank68]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank68]:     with open(json_path, "r", encoding="utf-8") as f:
[rank68]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank81]: Traceback (most recent call last):
[rank81]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank81]:     train(attn_implementation="flash_attention_2")
[rank81]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank81]:     trainer.train(resume_from_checkpoint=True)
[rank81]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank81]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank81]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank81]:     with open(json_path, "r", encoding="utf-8") as f:
[rank81]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank107]: Traceback (most recent call last):
[rank107]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank107]:     train(attn_implementation="flash_attention_2")
[rank107]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank107]:     trainer.train(resume_from_checkpoint=True)
[rank107]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank107]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank107]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank107]:     with open(json_path, "r", encoding="utf-8") as f:
[rank107]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank12]: Traceback (most recent call last):
[rank12]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank12]:     train(attn_implementation="flash_attention_2")
[rank12]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank12]:     trainer.train(resume_from_checkpoint=True)
[rank12]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank12]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank12]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank12]:     with open(json_path, "r", encoding="utf-8") as f:
[rank12]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank59]: Traceback (most recent call last):
[rank59]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank59]:     train(attn_implementation="flash_attention_2")
[rank59]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank59]:     trainer.train(resume_from_checkpoint=True)
[rank59]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank59]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank59]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank59]:     with open(json_path, "r", encoding="utf-8") as f:
[rank59]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank117]: Traceback (most recent call last):
[rank117]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank117]:     train(attn_implementation="flash_attention_2")
[rank117]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank117]:     trainer.train(resume_from_checkpoint=True)
[rank117]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank117]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank117]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank117]:     with open(json_path, "r", encoding="utf-8") as f:
[rank117]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank93]: Traceback (most recent call last):
[rank93]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank93]:     train(attn_implementation="flash_attention_2")
[rank93]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank93]:     trainer.train(resume_from_checkpoint=True)
[rank93]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank93]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank93]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank93]:     with open(json_path, "r", encoding="utf-8") as f:
[rank93]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank30]: Traceback (most recent call last):
[rank30]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank30]:     train(attn_implementation="flash_attention_2")
[rank30]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank30]:     trainer.train(resume_from_checkpoint=True)
[rank30]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank30]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank30]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank30]:     with open(json_path, "r", encoding="utf-8") as f:
[rank30]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank20]: Traceback (most recent call last):
[rank20]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank20]:     train(attn_implementation="flash_attention_2")
[rank20]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank20]:     trainer.train(resume_from_checkpoint=True)
[rank20]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank20]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank20]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank20]:     with open(json_path, "r", encoding="utf-8") as f:
[rank20]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank0]: Traceback (most recent call last):
[rank0]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank0]:     train(attn_implementation="flash_attention_2")
[rank0]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank0]:     trainer.train(resume_from_checkpoint=True)
[rank0]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank0]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank0]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank0]:     with open(json_path, "r", encoding="utf-8") as f:
[rank0]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank123]: Traceback (most recent call last):
[rank123]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank123]:     train(attn_implementation="flash_attention_2")
[rank123]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank123]:     trainer.train(resume_from_checkpoint=True)
[rank123]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank123]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank123]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank123]:     with open(json_path, "r", encoding="utf-8") as f:
[rank123]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank10]: Traceback (most recent call last):
[rank10]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank10]:     train(attn_implementation="flash_attention_2")
[rank10]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank10]:     trainer.train(resume_from_checkpoint=True)
[rank10]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank10]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank10]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank10]:     with open(json_path, "r", encoding="utf-8") as f:
[rank10]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank127]: Traceback (most recent call last):
[rank127]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank127]:     train(attn_implementation="flash_attention_2")
[rank127]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank127]:     trainer.train(resume_from_checkpoint=True)
[rank127]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank127]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank127]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank127]:     with open(json_path, "r", encoding="utf-8") as f:
[rank127]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank98]: Traceback (most recent call last):
[rank98]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank98]:     train(attn_implementation="flash_attention_2")
[rank98]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank98]:     trainer.train(resume_from_checkpoint=True)
[rank98]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank98]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank98]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank98]:     with open(json_path, "r", encoding="utf-8") as f:
[rank98]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank100]: Traceback (most recent call last):
[rank100]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank100]:     train(attn_implementation="flash_attention_2")
[rank100]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank100]:     trainer.train(resume_from_checkpoint=True)
[rank100]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank100]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank100]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank100]:     with open(json_path, "r", encoding="utf-8") as f:
[rank100]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank92]: Traceback (most recent call last):
[rank92]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank92]:     train(attn_implementation="flash_attention_2")
[rank92]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank92]:     trainer.train(resume_from_checkpoint=True)
[rank92]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank92]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank92]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank92]:     with open(json_path, "r", encoding="utf-8") as f:
[rank92]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank44]: Traceback (most recent call last):
[rank44]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank44]:     train(attn_implementation="flash_attention_2")
[rank44]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank44]:     trainer.train(resume_from_checkpoint=True)
[rank44]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank44]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank44]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank44]:     with open(json_path, "r", encoding="utf-8") as f:
[rank44]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank63]: Traceback (most recent call last):
[rank63]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank63]:     train(attn_implementation="flash_attention_2")
[rank63]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank63]:     trainer.train(resume_from_checkpoint=True)
[rank63]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank63]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank63]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank63]:     with open(json_path, "r", encoding="utf-8") as f:
[rank63]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank8]: Traceback (most recent call last):
[rank8]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank8]:     train(attn_implementation="flash_attention_2")
[rank8]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank8]:     trainer.train(resume_from_checkpoint=True)
[rank8]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank8]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank8]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank8]:     with open(json_path, "r", encoding="utf-8") as f:
[rank8]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank71]: Traceback (most recent call last):
[rank71]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank71]:     train(attn_implementation="flash_attention_2")
[rank71]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank71]:     trainer.train(resume_from_checkpoint=True)
[rank71]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank71]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank71]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank71]:     with open(json_path, "r", encoding="utf-8") as f:
[rank71]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank37]: Traceback (most recent call last):
[rank37]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank37]:     train(attn_implementation="flash_attention_2")
[rank37]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank37]:     trainer.train(resume_from_checkpoint=True)
[rank37]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank37]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank37]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank37]:     with open(json_path, "r", encoding="utf-8") as f:
[rank37]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
W0217 19:39:36.568000 35449 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 35630 closing signal SIGTERM
W0217 19:39:36.570000 35449 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 35631 closing signal SIGTERM
W0217 19:39:36.571000 35449 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 35632 closing signal SIGTERM
W0217 19:39:36.571000 35449 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 35633 closing signal SIGTERM
W0217 19:39:36.572000 35449 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 35634 closing signal SIGTERM
W0217 19:39:36.572000 35449 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 35636 closing signal SIGTERM
W0217 19:39:36.572000 35449 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 35637 closing signal SIGTERM
W0217 19:39:36.575000 1187194 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1187368 closing signal SIGTERM
W0217 19:39:36.578000 1187194 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1187369 closing signal SIGTERM
W0217 19:39:36.578000 1187194 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1187371 closing signal SIGTERM
W0217 19:39:36.579000 1187194 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1187372 closing signal SIGTERM
W0217 19:39:36.580000 1187194 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1187373 closing signal SIGTERM
W0217 19:39:36.580000 1187194 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1187374 closing signal SIGTERM
W0217 19:39:36.581000 1187194 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1187375 closing signal SIGTERM
[rank31]: Traceback (most recent call last):
[rank31]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank31]:     train(attn_implementation="flash_attention_2")
[rank31]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank31]:     trainer.train(resume_from_checkpoint=True)
[rank31]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank31]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank31]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank31]:     with open(json_path, "r", encoding="utf-8") as f:
[rank31]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank91]: Traceback (most recent call last):
[rank91]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank91]:     train(attn_implementation="flash_attention_2")
[rank91]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank91]:     trainer.train(resume_from_checkpoint=True)
[rank91]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank91]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank91]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank91]:     with open(json_path, "r", encoding="utf-8") as f:
[rank91]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank18]: Traceback (most recent call last):
[rank18]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank18]:     train(attn_implementation="flash_attention_2")
[rank18]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank18]:     trainer.train(resume_from_checkpoint=True)
[rank18]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank18]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank18]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank18]:     with open(json_path, "r", encoding="utf-8") as f:
[rank18]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
W0217 19:39:37.481000 2401523 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2403222 closing signal SIGTERM
W0217 19:39:37.484000 2401523 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2403223 closing signal SIGTERM
Using auto half precision backend
W0217 19:39:37.485000 2401523 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2403224 closing signal SIGTERM
W0217 19:39:37.485000 2401523 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2403225 closing signal SIGTERM
W0217 19:39:37.486000 2401523 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2403226 closing signal SIGTERM
W0217 19:39:37.487000 2401523 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2403227 closing signal SIGTERM
W0217 19:39:37.487000 2401523 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2403228 closing signal SIGTERM
[rank40]: Traceback (most recent call last):
[rank40]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank40]:     train(attn_implementation="flash_attention_2")
[rank40]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank40]:     trainer.train(resume_from_checkpoint=True)
[rank40]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank40]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank40]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank40]:     with open(json_path, "r", encoding="utf-8") as f:
[rank40]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
W0217 19:39:37.825000 37851 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 38076 closing signal SIGTERM
W0217 19:39:37.827000 37851 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 38077 closing signal SIGTERM
W0217 19:39:37.827000 37851 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 38079 closing signal SIGTERM
W0217 19:39:37.828000 37851 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 38080 closing signal SIGTERM
W0217 19:39:37.828000 37851 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 38082 closing signal SIGTERM
W0217 19:39:37.828000 37851 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 38085 closing signal SIGTERM
W0217 19:39:37.829000 37851 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 38086 closing signal SIGTERM
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank118]: Traceback (most recent call last):
[rank118]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank118]:     train(attn_implementation="flash_attention_2")
[rank118]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank118]:     trainer.train(resume_from_checkpoint=True)
[rank118]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank118]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank118]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank118]:     with open(json_path, "r", encoding="utf-8") as f:
[rank118]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank67]: Traceback (most recent call last):
[rank67]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank67]:     train(attn_implementation="flash_attention_2")
[rank67]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank67]:     trainer.train(resume_from_checkpoint=True)
[rank67]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank67]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank67]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank67]:     with open(json_path, "r", encoding="utf-8") as f:
[rank67]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank75]: Traceback (most recent call last):
[rank75]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank75]:     train(attn_implementation="flash_attention_2")
[rank75]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank75]:     trainer.train(resume_from_checkpoint=True)
[rank75]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank75]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank75]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank75]:     with open(json_path, "r", encoding="utf-8") as f:
[rank75]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank126]: Traceback (most recent call last):
[rank126]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank126]:     train(attn_implementation="flash_attention_2")
[rank126]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank126]:     trainer.train(resume_from_checkpoint=True)
[rank126]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank126]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank126]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank126]:     with open(json_path, "r", encoding="utf-8") as f:
[rank126]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank11]: Traceback (most recent call last):
[rank11]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank11]:     train(attn_implementation="flash_attention_2")
[rank11]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank11]:     trainer.train(resume_from_checkpoint=True)
[rank11]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank11]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank11]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank11]:     with open(json_path, "r", encoding="utf-8") as f:
[rank11]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank88]:[W217 19:39:38.383309881 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present,  but this warning has only been added since PyTorch 2.4 (function operator())
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank87]: Traceback (most recent call last):
[rank87]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank87]:     train(attn_implementation="flash_attention_2")
[rank87]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank87]:     trainer.train(resume_from_checkpoint=True)
[rank87]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank87]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank87]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank87]:     with open(json_path, "r", encoding="utf-8") as f:
[rank87]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
Using auto half precision backend
[rank80]: Traceback (most recent call last):
[rank80]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank80]:     train(attn_implementation="flash_attention_2")
[rank80]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank80]:     trainer.train(resume_from_checkpoint=True)
[rank80]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank80]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank80]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank80]:     with open(json_path, "r", encoding="utf-8") as f:
[rank80]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank38]: Traceback (most recent call last):
[rank38]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank38]:     train(attn_implementation="flash_attention_2")
[rank38]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank38]:     trainer.train(resume_from_checkpoint=True)
[rank38]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank38]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank38]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank38]:     with open(json_path, "r", encoding="utf-8") as f:
[rank38]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank33]: Traceback (most recent call last):
[rank33]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank33]:     train(attn_implementation="flash_attention_2")
[rank33]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank33]:     trainer.train(resume_from_checkpoint=True)
[rank33]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank33]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank33]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank33]:     with open(json_path, "r", encoding="utf-8") as f:
[rank33]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
W0217 19:39:39.422000 45067 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 45273 closing signal SIGTERM
W0217 19:39:39.425000 45067 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 45274 closing signal SIGTERM
W0217 19:39:39.426000 45067 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 45277 closing signal SIGTERM
W0217 19:39:39.426000 45067 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 45283 closing signal SIGTERM
W0217 19:39:39.426000 45067 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 45284 closing signal SIGTERM
W0217 19:39:39.427000 45067 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 45286 closing signal SIGTERM
W0217 19:39:39.427000 45067 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 45290 closing signal SIGTERM
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank15]: Traceback (most recent call last):
[rank15]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank15]:     train(attn_implementation="flash_attention_2")
[rank15]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank15]:     trainer.train(resume_from_checkpoint=True)
[rank15]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank15]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank15]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank15]:     with open(json_path, "r", encoding="utf-8") as f:
[rank15]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank61]: Traceback (most recent call last):
[rank61]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank61]:     train(attn_implementation="flash_attention_2")
[rank61]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank61]:     trainer.train(resume_from_checkpoint=True)
[rank61]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank61]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank61]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank61]:     with open(json_path, "r", encoding="utf-8") as f:
[rank61]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
Using auto half precision backend
[rank104]:[W217 19:39:39.892680571 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present,  but this warning has only been added since PyTorch 2.4 (function operator())
[rank24]: Traceback (most recent call last):
[rank24]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank24]:     train(attn_implementation="flash_attention_2")
[rank24]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank24]:     trainer.train(resume_from_checkpoint=True)
[rank24]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank24]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank24]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank24]:     with open(json_path, "r", encoding="utf-8") as f:
[rank24]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank112]:[W217 19:39:39.160879259 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present,  but this warning has only been added since PyTorch 2.4 (function operator())
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank26]: Traceback (most recent call last):
[rank26]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank26]:     train(attn_implementation="flash_attention_2")
[rank26]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank26]:     trainer.train(resume_from_checkpoint=True)
[rank26]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank26]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank26]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank26]:     with open(json_path, "r", encoding="utf-8") as f:
[rank26]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
W0217 19:39:40.492000 45670 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 46441 closing signal SIGTERM
W0217 19:39:40.495000 45670 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 46442 closing signal SIGTERM
W0217 19:39:40.496000 45670 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 46443 closing signal SIGTERM
W0217 19:39:40.497000 45670 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 46444 closing signal SIGTERM
W0217 19:39:40.497000 45670 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 46445 closing signal SIGTERM
W0217 19:39:40.497000 45670 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 46447 closing signal SIGTERM
W0217 19:39:40.498000 45670 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 46448 closing signal SIGTERM
[rank74]: Traceback (most recent call last):
[rank74]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank74]:     train(attn_implementation="flash_attention_2")
[rank74]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank74]:     trainer.train(resume_from_checkpoint=True)
[rank74]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank74]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank74]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank74]:     with open(json_path, "r", encoding="utf-8") as f:
[rank74]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank36]: Traceback (most recent call last):
[rank36]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank36]:     train(attn_implementation="flash_attention_2")
[rank36]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank36]:     trainer.train(resume_from_checkpoint=True)
[rank36]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank36]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank36]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank36]:     with open(json_path, "r", encoding="utf-8") as f:
[rank36]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank72]:[W217 19:39:41.410101457 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present,  but this warning has only been added since PyTorch 2.4 (function operator())
W0217 19:39:41.123000 31269 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 31447 closing signal SIGTERM
W0217 19:39:41.127000 31269 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 31448 closing signal SIGTERM
W0217 19:39:41.127000 31269 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 31449 closing signal SIGTERM
W0217 19:39:41.127000 31269 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 31450 closing signal SIGTERM
W0217 19:39:41.128000 31269 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 31451 closing signal SIGTERM
W0217 19:39:41.128000 31269 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 31452 closing signal SIGTERM
W0217 19:39:41.128000 31269 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 31454 closing signal SIGTERM
W0217 19:39:41.276000 24658 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 25501 closing signal SIGTERM
W0217 19:39:41.278000 24658 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 25502 closing signal SIGTERM
W0217 19:39:41.279000 24658 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 25503 closing signal SIGTERM
W0217 19:39:41.279000 24658 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 25504 closing signal SIGTERM
W0217 19:39:41.280000 24658 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 25506 closing signal SIGTERM
W0217 19:39:41.281000 24658 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 25507 closing signal SIGTERM
W0217 19:39:41.281000 24658 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 25508 closing signal SIGTERM
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
W0217 19:39:41.470000 239443 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 240260 closing signal SIGTERM
W0217 19:39:41.473000 239443 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 240261 closing signal SIGTERM
W0217 19:39:41.474000 239443 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 240263 closing signal SIGTERM
W0217 19:39:41.474000 239443 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 240264 closing signal SIGTERM
W0217 19:39:41.475000 239443 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 240265 closing signal SIGTERM
W0217 19:39:41.476000 239443 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 240266 closing signal SIGTERM
W0217 19:39:41.476000 239443 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 240267 closing signal SIGTERM
Using auto half precision backend
[rank120]: Traceback (most recent call last):
[rank120]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank120]:     train(attn_implementation="flash_attention_2")
[rank120]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank120]:     trainer.train(resume_from_checkpoint=True)
[rank120]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank120]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank120]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank120]:     with open(json_path, "r", encoding="utf-8") as f:
[rank120]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
Using auto half precision backend
[rank96]: Traceback (most recent call last):
[rank96]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank96]:     train(attn_implementation="flash_attention_2")
[rank96]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank96]:     trainer.train(resume_from_checkpoint=True)
[rank96]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank96]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank96]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank96]:     with open(json_path, "r", encoding="utf-8") as f:
[rank96]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank111]: Traceback (most recent call last):
[rank111]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank111]:     train(attn_implementation="flash_attention_2")
[rank111]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank111]:     trainer.train(resume_from_checkpoint=True)
[rank111]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank111]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank111]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank111]:     with open(json_path, "r", encoding="utf-8") as f:
[rank111]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank119]: Traceback (most recent call last):
[rank119]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank119]:     train(attn_implementation="flash_attention_2")
[rank119]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank119]:     trainer.train(resume_from_checkpoint=True)
[rank119]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank119]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank119]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank119]:     with open(json_path, "r", encoding="utf-8") as f:
[rank119]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
W0217 19:39:43.146000 40069 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40259 closing signal SIGTERM
W0217 19:39:43.149000 40069 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40260 closing signal SIGTERM
W0217 19:39:43.150000 40069 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40261 closing signal SIGTERM
W0217 19:39:43.150000 40069 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40262 closing signal SIGTERM
W0217 19:39:43.151000 40069 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40263 closing signal SIGTERM
W0217 19:39:43.151000 40069 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40264 closing signal SIGTERM
W0217 19:39:43.152000 40069 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40265 closing signal SIGTERM
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank66]: Traceback (most recent call last):
[rank66]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank66]:     train(attn_implementation="flash_attention_2")
[rank66]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank66]:     trainer.train(resume_from_checkpoint=True)
[rank66]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank66]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank66]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank66]:     with open(json_path, "r", encoding="utf-8") as f:
[rank66]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
W0217 19:39:43.393000 1829281 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1829459 closing signal SIGTERM
W0217 19:39:43.396000 1829281 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1829460 closing signal SIGTERM
W0217 19:39:43.397000 1829281 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1829461 closing signal SIGTERM
W0217 19:39:43.397000 1829281 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1829462 closing signal SIGTERM
W0217 19:39:43.398000 1829281 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1829464 closing signal SIGTERM
W0217 19:39:43.398000 1829281 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1829465 closing signal SIGTERM
W0217 19:39:43.398000 1829281 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1829466 closing signal SIGTERM
W0217 19:39:43.517000 44026 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 44206 closing signal SIGTERM
W0217 19:39:43.519000 44026 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 44207 closing signal SIGTERM
W0217 19:39:43.520000 44026 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 44208 closing signal SIGTERM
W0217 19:39:43.520000 44026 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 44209 closing signal SIGTERM
W0217 19:39:43.521000 44026 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 44210 closing signal SIGTERM
W0217 19:39:43.521000 44026 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 44213 closing signal SIGTERM
W0217 19:39:43.522000 44026 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 44215 closing signal SIGTERM
W0217 19:39:43.955000 42089 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 42275 closing signal SIGTERM
W0217 19:39:43.958000 42089 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 42276 closing signal SIGTERM
W0217 19:39:43.958000 42089 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 42277 closing signal SIGTERM
W0217 19:39:43.959000 42089 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 42279 closing signal SIGTERM
W0217 19:39:43.959000 42089 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 42280 closing signal SIGTERM
W0217 19:39:43.960000 42089 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 42281 closing signal SIGTERM
W0217 19:39:43.960000 42089 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 42282 closing signal SIGTERM
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank124]: Traceback (most recent call last):
[rank124]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank124]:     train(attn_implementation="flash_attention_2")
[rank124]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank124]:     trainer.train(resume_from_checkpoint=True)
[rank124]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank124]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank124]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank124]:     with open(json_path, "r", encoding="utf-8") as f:
[rank124]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
W0217 19:39:44.540000 41382 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 41598 closing signal SIGTERM
W0217 19:39:44.543000 41382 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 41600 closing signal SIGTERM
W0217 19:39:44.544000 41382 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 41601 closing signal SIGTERM
W0217 19:39:44.544000 41382 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 41602 closing signal SIGTERM
W0217 19:39:44.545000 41382 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 41603 closing signal SIGTERM
W0217 19:39:44.545000 41382 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 41604 closing signal SIGTERM
W0217 19:39:44.545000 41382 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 41605 closing signal SIGTERM
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
Using auto half precision backend
[rank64]: Traceback (most recent call last):
[rank64]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank64]:     train(attn_implementation="flash_attention_2")
[rank64]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank64]:     trainer.train(resume_from_checkpoint=True)
[rank64]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank64]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank64]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank64]:     with open(json_path, "r", encoding="utf-8") as f:
[rank64]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
[rank9]: Traceback (most recent call last):
[rank9]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank9]:     train(attn_implementation="flash_attention_2")
[rank9]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank9]:     trainer.train(resume_from_checkpoint=True)
[rank9]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank9]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank9]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank9]:     with open(json_path, "r", encoding="utf-8") as f:
[rank9]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
W0217 19:39:44.932000 2501994 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2502184 closing signal SIGTERM
W0217 19:39:44.935000 2501994 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2502185 closing signal SIGTERM
W0217 19:39:44.935000 2501994 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2502186 closing signal SIGTERM
W0217 19:39:44.937000 2501994 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2502187 closing signal SIGTERM
W0217 19:39:44.937000 2501994 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2502188 closing signal SIGTERM
W0217 19:39:44.938000 2501994 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2502189 closing signal SIGTERM
W0217 19:39:44.938000 2501994 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2502191 closing signal SIGTERM
/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py:1616: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `LLaVATrainer.__init__`. Use `processing_class` instead.
  trainer = LLaVATrainer(
[rank69]: Traceback (most recent call last):
[rank69]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train_mem.py", line 4, in <module>
[rank69]:     train(attn_implementation="flash_attention_2")
[rank69]:   File "/opt/hpcaas/.mounts/fs-036153e63d56f4dc2/home/zhaojiang/interleaved-llava/llava/train/train.py", line 1631, in train
[rank69]:     trainer.train(resume_from_checkpoint=True)
[rank69]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2215, in train
[rank69]:     state = TrainerState.load_from_json(os.path.join(resume_from_checkpoint, TRAINER_STATE_NAME))
[rank69]:   File "/home/zhaojiang/.local/lib/python3.10/site-packages/transformers/trainer_callback.py", line 150, in load_from_json
[rank69]:     with open(json_path, "r", encoding="utf-8") as f:
[rank69]: FileNotFoundError: [Errno 2] No such file or directory: '/fsx_0/user/zhaojiang/models/qwen-vl-gen/checkpoint-9000/trainer_state.json'
W0217 19:39:46.742000 33190 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 33384 closing signal SIGTERM
W0217 19:39:46.745000 33190 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 33385 closing signal SIGTERM
W0217 19:39:46.746000 33190 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 33386 closing signal SIGTERM
W0217 19:39:46.746000 33190 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 33387 closing signal SIGTERM
W0217 19:39:46.746000 33190 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 33388 closing signal SIGTERM
W0217 19:39:46.747000 33190 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 33389 closing signal SIGTERM
W0217 19:39:46.747000 33190 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 33391 closing signal SIGTERM
E0217 19:39:56.939000 37851 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 2 (pid: 38078) of binary: /usr/bin/python3.10
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
llava/train/train_mem.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-02-17_19:39:37
  host      : h100-st-p548xlarge-271.ar-ai-use2.hpcaas
  rank      : 42 (local_rank: 2)
  exitcode  : 1 (pid: 38078)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
E0217 19:39:57.247000 35449 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 0 (pid: 35629) of binary: /usr/bin/python3.10
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
llava/train/train_mem.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-02-17_19:39:36
  host      : h100-st-p548xlarge-272.ar-ai-use2.hpcaas
  rank      : 48 (local_rank: 0)
  exitcode  : 1 (pid: 35629)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
srun: error: h100-st-p548xlarge-271: task 5: Exited with exit code 1
srun: Terminating StepId=336303.0
slurmstepd: error: *** STEP 336303.0 ON h100-st-p548xlarge-14 CANCELLED AT 2025-02-17T19:39:57 ***
W0217 19:39:57.301000 1187194 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.302000 1187194 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1187375 closing signal SIGTERM
W0217 19:39:57.301000 2401523 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.301000 2501994 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.301000 33190 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.303000 2401523 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2403222 closing signal SIGTERM
W0217 19:39:57.303000 2501994 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 2502189 closing signal SIGTERM
W0217 19:39:57.303000 33190 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 33384 closing signal SIGTERM
W0217 19:39:57.301000 44026 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.302000 24658 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.302000 42089 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.302000 40069 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.302000 45670 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.302000 239443 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.303000 44026 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 44215 closing signal SIGTERM
W0217 19:39:57.302000 31269 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.303000 42089 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 42275 closing signal SIGTERM
W0217 19:39:57.303000 24658 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 25503 closing signal SIGTERM
W0217 19:39:57.304000 40069 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40260 closing signal SIGTERM
W0217 19:39:57.301000 1829281 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.304000 45670 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 46445 closing signal SIGTERM
W0217 19:39:57.302000 41382 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.302000 45067 .local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGTERM death signal, shutting down workers
W0217 19:39:57.304000 31269 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 31452 closing signal SIGTERM
W0217 19:39:57.304000 239443 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 240267 closing signal SIGTERM
W0217 19:39:57.304000 1829281 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 1829462 closing signal SIGTERM
W0217 19:39:57.304000 41382 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 41600 closing signal SIGTERM
W0217 19:39:57.304000 45067 .local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 45283 closing signal SIGTERM
srun: error: h100-st-p548xlarge-272: task 6: Terminated
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 260, in launch_agent
    result = agent.run()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 696, in run
    result = self._invoke_run(role)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 856, in _invoke_run
    run_result = self._monitor_workers(self._worker_group)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 387, in _monitor_workers
    result = self._pcontext.wait(0)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 531, in wait
    return self._poll()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 861, in _poll
    self.close()  # terminate all running procs
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 572, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 909, in _close
    handler.proc.wait(time_to_wait)
  File "/usr/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 1187194 got signal: 15
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 260, in launch_agent
    result = agent.run()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 696, in run
    result = self._invoke_run(role)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 856, in _invoke_run
    run_result = self._monitor_workers(self._worker_group)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 387, in _monitor_workers
    result = self._pcontext.wait(0)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 531, in wait
    return self._poll()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 861, in _poll
    self.close()  # terminate all running procs
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 572, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 909, in _close
    handler.proc.wait(time_to_wait)
  File "/usr/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 2401523 got signal: 15
srun: error: h100-st-p548xlarge-14: task 0: Exited with exit code 1
srun: error: h100-st-p548xlarge-75: task 2: Exited with exit code 1
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 260, in launch_agent
    result = agent.run()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 696, in run
    result = self._invoke_run(role)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 856, in _invoke_run
    run_result = self._monitor_workers(self._worker_group)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 387, in _monitor_workers
    result = self._pcontext.wait(0)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 531, in wait
    return self._poll()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 861, in _poll
    self.close()  # terminate all running procs
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 572, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 909, in _close
    handler.proc.wait(time_to_wait)
  File "/usr/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 45067 got signal: 15
srun: error: h100-st-p548xlarge-338: task 11: Exited with exit code 1
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 260, in launch_agent
    result = agent.run()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 696, in run
    result = self._invoke_run(role)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 856, in _invoke_run
    run_result = self._monitor_workers(self._worker_group)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 387, in _monitor_workers
    result = self._pcontext.wait(0)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 531, in wait
    return self._poll()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 861, in _poll
    self.close()  # terminate all running procs
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 572, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 909, in _close
    handler.proc.wait(time_to_wait)
  File "/usr/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 31269 got signal: 15
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 260, in launch_agent
    result = agent.run()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 696, in run
    result = self._invoke_run(role)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 856, in _invoke_run
    run_result = self._monitor_workers(self._worker_group)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 387, in _monitor_workers
    result = self._pcontext.wait(0)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 531, in wait
    return self._poll()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 861, in _poll
    self.close()  # terminate all running procs
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 572, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 909, in _close
    handler.proc.wait(time_to_wait)
  File "/usr/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 24658 got signal: 15
srun: error: h100-st-p548xlarge-273: task 7: Exited with exit code 1
srun: error: h100-st-p548xlarge-275: task 9: Exited with exit code 1
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 260, in launch_agent
    result = agent.run()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 696, in run
    result = self._invoke_run(role)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 856, in _invoke_run
    run_result = self._monitor_workers(self._worker_group)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 387, in _monitor_workers
    result = self._pcontext.wait(0)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 531, in wait
    return self._poll()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 861, in _poll
    self.close()  # terminate all running procs
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 572, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 909, in _close
    handler.proc.wait(time_to_wait)
  File "/usr/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 40069 got signal: 15
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 260, in launch_agent
    result = agent.run()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 696, in run
    result = self._invoke_run(role)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 856, in _invoke_run
    run_result = self._monitor_workers(self._worker_group)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 387, in _monitor_workers
    result = self._pcontext.wait(0)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 531, in wait
    return self._poll()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 861, in _poll
    self.close()  # terminate all running procs
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 572, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 909, in _close
    handler.proc.wait(time_to_wait)
  File "/usr/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 45670 got signal: 15
srun: error: h100-st-p548xlarge-358: task 14: Exited with exit code 1
srun: error: h100-st-p548xlarge-337: task 10: Exited with exit code 1
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 260, in launch_agent
    result = agent.run()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 696, in run
    result = self._invoke_run(role)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 856, in _invoke_run
    run_result = self._monitor_workers(self._worker_group)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 387, in _monitor_workers
    result = self._pcontext.wait(0)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 531, in wait
    return self._poll()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 861, in _poll
    self.close()  # terminate all running procs
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 572, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 909, in _close
    handler.proc.wait(time_to_wait)
  File "/usr/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 44026 got signal: 15
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 260, in launch_agent
    result = agent.run()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 696, in run
    result = self._invoke_run(role)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 856, in _invoke_run
    run_result = self._monitor_workers(self._worker_group)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 387, in _monitor_workers
    result = self._pcontext.wait(0)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 531, in wait
    return self._poll()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 861, in _poll
    self.close()  # terminate all running procs
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 572, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 909, in _close
    handler.proc.wait(time_to_wait)
  File "/usr/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 239443 got signal: 15
srun: error: h100-st-p548xlarge-340: task 13: Exited with exit code 1
srun: error: h100-st-p548xlarge-121: task 4: Exited with exit code 1
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 260, in launch_agent
    result = agent.run()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 696, in run
    result = self._invoke_run(role)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 856, in _invoke_run
    run_result = self._monitor_workers(self._worker_group)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 387, in _monitor_workers
    result = self._pcontext.wait(0)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 531, in wait
    return self._poll()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 861, in _poll
    self.close()  # terminate all running procs
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 572, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 909, in _close
    handler.proc.wait(time_to_wait)
  File "/usr/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 42089 got signal: 15
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 260, in launch_agent
    result = agent.run()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 696, in run
    result = self._invoke_run(role)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 856, in _invoke_run
    run_result = self._monitor_workers(self._worker_group)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 387, in _monitor_workers
    result = self._pcontext.wait(0)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 531, in wait
    return self._poll()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 861, in _poll
    self.close()  # terminate all running procs
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 572, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 909, in _close
    handler.proc.wait(time_to_wait)
  File "/usr/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 1829281 got signal: 15
srun: error: h100-st-p548xlarge-339: task 12: Exited with exit code 1
srun: error: h100-st-p548xlarge-96: task 3: Exited with exit code 1
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 260, in launch_agent
    result = agent.run()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 696, in run
    result = self._invoke_run(role)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 856, in _invoke_run
    run_result = self._monitor_workers(self._worker_group)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 387, in _monitor_workers
    result = self._pcontext.wait(0)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 531, in wait
    return self._poll()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 861, in _poll
    self.close()  # terminate all running procs
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 572, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 909, in _close
    handler.proc.wait(time_to_wait)
  File "/usr/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 41382 got signal: 15
srun: error: h100-st-p548xlarge-359: task 15: Exited with exit code 1
Traceback (most recent call last):
  File "/home/zhaojiang/.local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 260, in launch_agent
    result = agent.run()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 696, in run
    result = self._invoke_run(role)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/api.py", line 856, in _invoke_run
    run_result = self._monitor_workers(self._worker_group)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/metrics/api.py", line 137, in wrapper
    result = f(*args, **kwargs)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/agent/server/local_elastic_agent.py", line 387, in _monitor_workers
    result = self._pcontext.wait(0)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 531, in wait
    return self._poll()
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 861, in _poll
    self.close()  # terminate all running procs
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 572, in close
    self._close(death_sig=death_sig, timeout=timeout)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 909, in _close
    handler.proc.wait(time_to_wait)
  File "/usr/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib/python3.10/subprocess.py", line 1953, in _wait
    time.sleep(delay)
  File "/home/zhaojiang/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 84, in _terminate_process_handler
    raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval)
torch.distributed.elastic.multiprocessing.api.SignalException: Process 2501994 got signal: 15
srun: error: h100-st-p548xlarge-44: task 1: Exited with exit code 1
srun: error: h100-st-p548xlarge-274: task 8: Killed
srun: Force Terminated StepId=336303.0
pretrain.sh: 82: python: not found