Phi 4 Multimodal
Interact with a multimodal AI model using text, images, and audio
Generate edited images using text prompts and styles
Generate depth maps from your images
Large Language Diffusion Models
Wan: Open and Advanced Large-Scale Video Generative Models
Compare the latest VAEs
PDF to Structured Data powered by Google DeepMind Gemini 2.0
Break the language barrier
The ultimate guide to training LLMs on large GPU clusters
Blazingly Fast and Embarrassingly Simple Song Generation
Generate images from text prompts
Generate text responses based on user input
Upload images to try on clothes virtually
Generate project deadlines
Execute custom code from environment variables
Scalable and Versatile 3D Generation from images
Run code from an environment variable
Select benchmarks and languages for text embeddings evaluation
MidJour | A RealVisXL_Turbo | IRL HI-Res Images Gen
Text-to-3D and Image-to-3D Generation
Generate a podcast using Kokoro-TTS!
Create your own AI comic with a single prompt
A text-to-speech model powered by SparkAudio and Mobvoi.
Track, rank and evaluate open LLMs and chatbots
Free Reverse Image Search
Edit and enhance images with custom color and edge modifications