google/matcha-base

Visual Question Answering

image-text-to-text


nielsr (HF staff) committed on Jul 22, 2023

Commit 2c884b7, 1 parent: b1c5087

Update README.md

Files changed (1)

README.md +18 -1


@@ -30,7 +30,24 @@ The abstract of the paper states that:
 # Using the model
-## Converting from T5x to huggingface
 You can use the [`convert_pix2struct_checkpoint_to_pytorch.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/models/pix2struct/convert_pix2struct_original_pytorch_to_hf.py) script as follows:
 ```bash

 # Using the model
+```python
+from transformers import Pix2StructProcessor, Pix2StructForConditionalGeneration
+import requests
+from PIL import Image
+processor = Pix2StructProcessor.from_pretrained('google/matcha-base')
+model = Pix2StructForConditionalGeneration.from_pretrained('google/matcha-base')
+url = "https://raw.githubusercontent.com/vis-nlp/ChartQA/main/ChartQA%20Dataset/val/png/20294671002019.png"
+image = Image.open(requests.get(url, stream=True).raw)
+inputs = processor(images=image, text="Is the sum of all 4 places greater than Laos?", return_tensors="pt")
+predictions = model.generate(**inputs, max_new_tokens=512)
+print(processor.decode(predictions[0], skip_special_tokens=True))
+# >>> No
+```
+# Converting from T5x to huggingface
 You can use the [`convert_pix2struct_checkpoint_to_pytorch.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/models/pix2struct/convert_pix2struct_original_pytorch_to_hf.py) script as follows:
 ```bash