Multimodal RAG Pejman
Extract and answer questions from PDFs using images
Generate images from text prompts
Process video to detect specified objects
Engage in conversations and ask questions using text or images
Analyze images to caption, detect objects, extract text, and ground phrases
Generate answers to questions about images