Blip Image Captioning Base BF16
This model is a quantized version of Salesforce/blip-image-captioning-base, an image-to-text model. Converting the weights from float32 to bfloat16 precision reduces the memory footprint from 989 MB to 494 MB, a reduction of about 50 percent.
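The halving follows directly from the storage width of each type: float32 uses 4 bytes per parameter and bfloat16 uses 2. The sketch below reproduces the arithmetic. It assumes that 1 MB means 10^6 bytes and that the weights account for essentially all of the reported footprint. The parameter count is only an estimate back-calculated from the 989 MB figure, and `weights_size_mb` is a hypothetical helper, not part of any library.

```python
# Bytes used to store one parameter in each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weights_size_mb(num_params: int, dtype: str) -> float:
    """Approximate size of the weights in MB (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


# Estimate implied by the 989 MB float32 figure; not an official parameter count.
approx_params = round(989e6 / BYTES_PER_PARAM["float32"])

fp32_mb = weights_size_mb(approx_params, "float32")
bf16_mb = weights_size_mb(approx_params, "bfloat16")

print(f"float32:  {fp32_mb:.1f} MB")
print(f"bfloat16: {bf16_mb:.1f} MB")
print(f"reduction: {100 * (1 - bf16_mb / fp32_mb):.0f}%")
```

The bfloat16 result of 494.5 MB agrees with the reported 494 MB up to rounding.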
Example
a cat sitting on top of a purple and red striped carpet
How to Get Started with the Model
Use the code below to get started with the model.
import requests
from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor

model = BlipForConditionalGeneration.from_pretrained("gospacedev/blip-image-captioning-base-bf16")
processor = BlipProcessor.from_pretrained("gospacedev/blip-image-captioning-base-bf16")

# Load sample image (img_url is a placeholder; replace it with the URL of your own image)
img_url = "https://example.com/image.jpg"
image = Image.open(requests.get(img_url, stream=True).raw).convert('RGB')

# Generate output
inputs = processor(image, return_tensors="pt")
output = model.generate(**inputs)

# Decode the generated token IDs into a caption string
result = processor.decode(output[0], skip_special_tokens=True)
print(result)
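Depending on the installed transformers version, `from_pretrained` may convert the checkpoint to float32 when no dtype is given, which would undo the memory saving at load time. The sketch below requests bfloat16 explicitly through the `torch_dtype` argument and casts the image tensor to match. It is a minimal sketch rather than the author's documented method: `caption_bf16` is a hypothetical helper, and it assumes that your PyTorch build supports the required bfloat16 operations on your device.

```python
def caption_bf16(image, repo_id="gospacedev/blip-image-captioning-base-bf16"):
    """Caption a PIL image while keeping the model weights in bfloat16.

    Hypothetical helper for illustration only.
    """
    # Imported locally so that defining the function does not load torch.
    import torch
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(repo_id)
    # Ask for bfloat16 weights instead of relying on the default dtype.
    model = BlipForConditionalGeneration.from_pretrained(
        repo_id, torch_dtype=torch.bfloat16
    )

    # The processor returns float32 pixel values; cast them to match the model.
    pixel_values = processor(image, return_tensors="pt")["pixel_values"]
    pixel_values = pixel_values.to(torch.bfloat16)

    output = model.generate(pixel_values=pixel_values)
    return processor.decode(output[0], skip_special_tokens=True)
```

Call it with an RGB PIL image, for example the `image` object loaded in the snippet above: `print(caption_bf16(image))`.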
Model Details
- Developed by: Grantley Cullar
- Model type: Image-to-Text
- Language(s) (NLP): English
- License: MIT License