Inference with MiniCPM-V works without problems, but several errors occur during fine-tuning. The fine-tuning code is from https://github.com/OpenBMB/MiniCPM-V/tree/main.

issue

patch_size

AttributeError: 'MiniCPMVConfig' object has no attribute 'patch_size'

scale_resolution

File "path/MiniCPM-V/finetune/dataset.py", line 307, in preprocess
    assert "scale_resolution" in slice_config
AssertionError

RuntimeError

In resampler.py, the get_abs_pos function uses sqrt to derive a size, and the resulting size does not match the original size. This is consistent with the traceback below: the input has 1014 positions, but int(sqrt(1014)) = 31 and 31 × 31 = 961.

File "/root/.cache/huggingface/modules/transformers_modules/MiniCPM-V/resampler.py", line 154, in forward
    x + pos_embed.unsqueeze(1),
RuntimeError: The size of tensor a (1014) must match the size of tensor b (961) at non-singleton dimension 0
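
The two numbers in the error message can be reproduced with plain arithmetic: 1014 is not a perfect square, so truncating its square root and squaring again gives 961. The following is a minimal sketch of that arithmetic only, assuming the side length is computed as int(sqrt(n)); `square_grid_size` is a hypothetical helper, and the actual code in get_abs_pos may differ.

```python
import math


def square_grid_size(num_positions: int) -> int:
    """Size of the largest square grid that fits into num_positions.

    Assumption: the side length is the truncated square root.
    """
    side = int(math.sqrt(num_positions))
    return side * side


print(square_grid_size(1014))  # 961, not 1014: the sizes no longer match
print(square_grid_size(961))   # 961, a perfect square round-trips cleanly
```

Any sequence length that is not a perfect square will lose positions this way, which would explain why the addition in forward fails.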

bug fix

patch_size

Add "patch_size": 14, to the config.py file.

scale_resolution

Add "scale_resolution": 448, to the config.py file.
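
Both fixes come down to adding two missing keys to the model configuration. Here is a minimal sketch, assuming the configuration can be treated as a plain key/value mapping; the helper name `add_missing_keys` and the starting contents are hypothetical, and how the mapping is loaded and saved depends on your setup. The values 14 and 448 are the ones given above.

```python
from typing import Any, Dict

# Values taken from the two fixes above.
REQUIRED_DEFAULTS: Dict[str, Any] = {
    "patch_size": 14,
    "scale_resolution": 448,
}


def add_missing_keys(config: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with any missing default keys filled in.

    Existing values are never overwritten.
    """
    patched = dict(config)
    for key, value in defaults.items():
        patched.setdefault(key, value)
    return patched


config = {"model_type": "minicpmv"}  # hypothetical starting contents
patched = add_missing_keys(config, REQUIRED_DEFAULTS)
print(patched)
```

Using setdefault keeps any value that is already present, so the sketch is safe to run against a configuration that has since been updated.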
