Phi 4 Multimodal
🌖
Interact with a multimodal AI model using text, images, and audio
Generate text transcripts with timestamps from audio or video
Generate subtitles for audio/video
Transcribe audio to text from microphone or file
Transcribe audio from your microphone or an uploaded file
Transcribe or translate audio from microphone, file, or YouTube