transformers
torch
soundfile
einops
accelerate
safetensors
SpeechRecognition