Just a few questions about voice cloning, audio samples, etc

by ArthurParkerhouse - opened 2 days ago

Just curious, is there a maximum limit on how long the source audio sample can be?

For example, should the source audio sample always be 30 seconds or less, or can you use a 180+ second audio sample for the prompt?

Do longer audio samples make a difference in output quality when voice cloning?

Can the cloned voice be saved for future text-to-speech generations, or does the audio sample need to be provided every time a text-to-speech request is made?
