Failed to run on MacBook: requiring flash_attn
Seems like the same problem as: https://huggingface.co/microsoft/phi-1_5/discussions/72
I'm trying the provided workaround with modeling_florence2.py (still downloading the model; no crash so far):
```python
import os
from unittest.mock import patch

import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor
from transformers.dynamic_module_utils import get_imports


def fixed_get_imports(filename: str | os.PathLike) -> list[str]:
    """Work around for https://huggingface.co/microsoft/phi-1_5/discussions/72."""
    if not str(filename).endswith("/modeling_florence2.py"):
        return get_imports(filename)
    imports = get_imports(filename)
    imports.remove("flash_attn")
    return imports


# Patch get_imports so the dynamic module loader doesn't insist on flash_attn.
with patch("transformers.dynamic_module_utils.get_imports", fixed_get_imports):
    model = AutoModelForCausalLM.from_pretrained("microsoft/Florence-2-large-ft", trust_remote_code=True)
    processor = AutoProcessor.from_pretrained("microsoft/Florence-2-large-ft", trust_remote_code=True)

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg?download=true"
image = Image.open(requests.get(url, stream=True).raw)


def run_example(prompt):
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3,
    )
    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    parsed_answer = processor.post_process_generation(generated_text, task=prompt, image_size=(image.width, image.height))
    print(parsed_answer)


prompt = "<MORE_DETAILED_CAPTION>"
run_example(prompt)
```
Edit: With this workaround, it works on my MacBook!
I was trying to run this on Windows and wondering why the patch wasn't working, until I realized the forward slash wouldn't match a Windows file path.
For a more general solution that should work on any platform, replace this line:
```python
if not str(filename).endswith("/modeling_florence2.py"):
```

with

```python
if os.path.basename(filename) != "modeling_florence2.py":
```
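Putting the two pieces together, here is the full patched function with that change applied (identical to the original workaround except for the filename check):

```python
import os
from transformers.dynamic_module_utils import get_imports


def fixed_get_imports(filename: str | os.PathLike) -> list[str]:
    """Work around for https://huggingface.co/microsoft/phi-1_5/discussions/72."""
    # os.path.basename handles both forward and backward slashes on Windows,
    # so this check works regardless of platform.
    if os.path.basename(filename) != "modeling_florence2.py":
        return get_imports(filename)
    imports = get_imports(filename)
    imports.remove("flash_attn")
    return imports
```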
Thank you for this patch, it works well on Mac.
I've duplicated the Zero-GPU space for the large model, changed it to the base-ft model, and applied your patch; it works both locally on Mac and on a CPU-based Space:
https://huggingface.co/spaces/Norod78/Florence-2-base-ft
Cheers!
Hi, I tried using your patch, but I keep running into this error:

```
TypeError: Object of type Florence2LanguageConfig is not JSON serializable
```

on the line

```python
model = AutoModelForCausalLM.from_pretrained("microsoft/Florence-2-large-ft", trust_remote_code=True)
```
Is there something I'm missing?
I'm also on Mac and I have never seen this error before. Try making sure your transformers install is recent, your numpy is on 1.x, etc.
Here is my pip dump from the environment this worked in: Python 3.10.13 (Miniconda). This env is general-purpose and has lots of stuff in it, but perhaps you can find a version difference in one of the major frameworks:
https://pastebin.com/5rgmUtwL
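If it helps narrow things down, here is a quick sanity check you can run in the failing environment (just printing versions, nothing Florence-2-specific):

```python
# Print the library versions most likely to matter for this error.
import numpy
import torch
import transformers

print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("numpy:", numpy.__version__)  # the suggestion above is to stay on 1.x
```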
Note that this trick also works when choosing "MPS" (Apple Silicon) as your torch backend.
https://huggingface.co/spaces/Norod78/Florence-2-base-ft/blob/main/app.py
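For anyone wondering what that looks like, here is a minimal sketch of the MPS device selection, reusing model, processor, image, and prompt from the snippet above. This is my own simplified version, not the exact code in app.py; check the link above for the real thing:

```python
import torch

# Use MPS on Apple Silicon if available; otherwise fall back to CPU.
device = torch.device("mps") if torch.backends.mps.is_available() else torch.device("cpu")
model = model.to(device)

# Move the processor outputs to the same device as the model.
inputs = processor(text=prompt, images=image, return_tensors="pt")
inputs = {k: v.to(device) for k, v in inputs.items()}

generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    num_beams=3,
)
```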