Unique3D
Create a 1M-face colored 3D model from an image!
Try PaliGemma on document understanding tasks
Generate audio from text prompts
Annotate and describe images with text prompts
Fastest high-quality video diffusion model.
Generate an edited video from a prompt
Video upscaler/restorer
Generate annotated video with object detection
Generate images from text or captions
Generate summaries from YouTube videos or uploaded videos
Chat about images with AI
Launch a language processing workflow
Enhance and upscale images with controlnet guidance
In-browser speech recognition w/ word-level timestamps
High-fidelity Virtual Try-on
Video-to-Audio Generation with Hidden Alignment
Multimodal Image-to-Video
Generate transcript from audio input
Generate images from text prompts
Aesthetically Controllable Text-Driven Stylization without Training
Generate lifelike audio-driven portrait animations from images and audio
Try on clothes virtually with images
Engage in multi-modal conversations with images and videos
Generate images by combining a foreground with a custom background
Virtual try-on for clothes on a person
Text-to-Video
Generate text from images and videos
Process audio to text and generate AI response
Convert image text to markdown format
Remove background from ID photos
Generate text by combining an image and a question
Create a video from an image with camera motion
Analyse any image with Llama 3.2
Erase or change parts of images using masks
Convert PDFs to page images for dataset creation
Generate retrieval queries from document images
Query an image index to get answers
Transcribe or translate audio and YouTube videos
Generate high-quality music from text descriptions
Generate detailed script for podcast or lecture from text input
Ultra-high resolution image synthesis
Text-to-Video
Generate and edit audio from text prompts
VLMEvalKit Evaluation Results Collection
Generate personalized research profiles and chat with Arxiv Copilot
Interpret and execute code with responses
High-fidelity Virtual Try-on
Upload an image and ask questions about it
Visual Retrieval with ColPali and Vespa
Using RAG LLM to assist your academic writing
Generate images with virtual try-on or pose transfer