What is the prompt format for VQA?
The VQA results look good, but I don't see the correct prompt to use this feature.
I would assume there is a prefix and/or suffix to the question, something like
"<VQA> What color is the car? </VQA>"
but I don't see the prompt format. Could you please advise on how to utilize this important feature?
hi, the vqa prompt is just the question itself, no prefix or suffix needed; '{question}' is a placeholder for your actual question text. will update the readme
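For reference, a minimal sketch of a plain-question VQA call. The model id and the processor/generate calls follow the public Florence-2 model-card sample, but treat the details here as assumptions rather than the official snippet:

```python
def build_vqa_prompt(question: str) -> str:
    """Florence-2 VQA prompt is the bare question, with no <VQA>-style tags."""
    return question


def run_vqa(image, question: str) -> str:
    """Run one VQA query. Assumes the standard transformers API from the
    Florence-2 model card; loading the model needs a network connection."""
    from transformers import AutoModelForCausalLM, AutoProcessor

    model_id = "microsoft/Florence-2-large"  # or the -base variant
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    # The text input is just the question string, nothing wrapped around it.
    inputs = processor(
        text=build_vqa_prompt(question), images=image, return_tensors="pt"
    )
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=64,
    )
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```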
@haipingwu can you update the README file?
I was trying something like this:
res = run_example('{question}', system_instruction + '\n' + prompt)
and running into the following error:
IndexError: index out of range in self
I'm assuming my prompt format is wrong. If possible, can you update the README file with a code snippet for Q&A?
Thanks in advance
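That IndexError ("index out of range in self") usually comes from an embedding lookup receiving an unexpected token id; the more immediate problem above is that '{question}' was passed as a literal string, so the model never saw a real question. A minimal sketch of the fix, assuming '{question}' was meant as a template placeholder (run_example is the helper from the snippet above and is not redefined here):

```python
# '{question}' in the original call is a literal string, so the model sees
# the placeholder itself. Fill the placeholder in first; str.format does the
# substitution (the template value here is illustrative).

def build_prompt(template: str, question: str) -> str:
    """Substitute the actual question into a '{question}' template."""
    return template.format(question=question)


prompt = build_prompt("{question}", "What color is the car?")
print(prompt)  # -> What color is the car?
# res = run_example(prompt)  # then pass the filled-in prompt, not the template
```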
Hi @iRanadheer ,
If you are handling VQA, I think you need to use the fine-tuned version from the Hugging Face team:
https://huggingface.co/HuggingFaceM4/Florence-2-DocVQA
The task prompt should be "<DocVQA>" followed by your question.
You can also refer to their blog post:
https://huggingface.co/blog/finetune-florence2
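Since this fine-tune prepends a task token rather than using the bare question, a small sketch of the prompt construction; the exact "<DocVQA>" prefix is taken from the linked blog post, so double-check it there:

```python
# Prompt construction for the HuggingFaceM4/Florence-2-DocVQA fine-tune.
# The "<DocVQA>" task token is prepended to the question, per the linked
# blog post (treat the exact token as an assumption and verify there).

def build_docvqa_prompt(question: str) -> str:
    return "<DocVQA>" + question


print(build_docvqa_prompt("What is the invoice total?"))
# -> <DocVQA>What is the invoice total?
```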
Hi, how do I fine-tune the model? I got the following error while fine-tuning:
modeling_florence2.py", line 2746, in forward
[rank1]: outputs = self.language_model(
[rank1]: File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: File "/florence2/modeling_florence2.py", line 2159, in forward
[rank1]: lm_logits = self.lm_head(outputs[0])
[rank1]: File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/linear.py", line 116, in forward
[rank1]: return F.linear(input, self.weight, self.bias)
[rank1]: RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16BF, lda, b, CUDA_R_16BF, ldb, &fbeta, c, CUDA_R_16BF, ldc, compute_type, CUBLAS_GEMM_DEFAULT_TENSOR_OP)
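On the CUBLAS failure: CUBLAS_STATUS_EXECUTION_FAILED on a CUDA_R_16BF GEMM is often a dtype/hardware mismatch, since bf16 matmuls require an Ampere-or-newer GPU. A hedged sketch of a dtype fallback check; the helper name and the fallback order are assumptions on my part, not an official fix:

```python
# Pick a matmul dtype the current GPU actually supports, falling back from
# bf16 to fp16 to fp32. Kept as a pure helper so the logic is testable
# without a GPU; the torch mapping is shown in the comment below.

def pick_dtype_name(cuda_available: bool, bf16_supported: bool) -> str:
    """Choose a dtype name: bf16 only where the hardware supports it."""
    if cuda_available and bf16_supported:
        return "bfloat16"
    if cuda_available:
        return "float16"
    return "float32"


# With torch, the two capability checks map onto real APIs:
#   dtype = getattr(torch, pick_dtype_name(torch.cuda.is_available(),
#                                          torch.cuda.is_bf16_supported()))
#   model = AutoModelForCausalLM.from_pretrained(..., torch_dtype=dtype,
#                                                trust_remote_code=True)
```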