This model is a fine-tuned version of whisper-large-v3-turbo, trained on 1M audio samples from the mitermix/audiosnippets dataset.
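
A minimal loading sketch follows, assuming the checkpoint is published on the Hugging Face Hub and works with the standard Transformers `automatic-speech-recognition` pipeline, as Whisper checkpoints generally do. `MODEL_ID` is a hypothetical placeholder, not the real repository name, and the helper names `build_pipeline` and `audio_to_text` are illustrative only. What the generated text represents depends on the fine-tuning targets, which are not stated here.

```python
# Hypothetical placeholder: replace with this model's actual Hub repository ID.
MODEL_ID = "your-namespace/your-finetuned-whisper"


def build_pipeline(model_id: str = MODEL_ID, device: str = "cpu"):
    """Create a Transformers pipeline for a Whisper-style checkpoint."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        device=device,
    )


def audio_to_text(audio_path: str, asr=None) -> str:
    """Run the model on one audio file and return the generated text."""
    if asr is None:
        asr = build_pipeline()
    # The ASR pipeline returns a dict whose "text" key holds the generated string.
    result = asr(audio_path)
    return result["text"].strip()
```

With the placeholder replaced, a call such as `audio_to_text("sample.wav")` should return the text generated for that file; pass `device="cuda"` to `build_pipeline` if a GPU is available.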