Just a few questions about voice cloning, audio samples, etc.
#2, opened by ArthurParkerhouse
Just curious, is there a (maximum) limit to how long the source audio sample can be?
For example, should the source audio sample always be 30 seconds or less, or can you use a 180+ second audio sample for the prompt?
Do longer audio samples make a difference in output quality when voice cloning?
Can the cloned voice be saved for future text-to-speech generations, or does the audio sample need to be supplied every time a text-to-speech request is made?