Emotions

#3
by jujutechnology - opened

Hi. I thought this model was supposed to be able to show emotions? I never get any, even when using the demo text from the model page. Is that only available if I download it locally and adjust some settings?

My impression is that you can "prompt" the emotions you want it to mimic by providing a one-shot voice-cloning sample.

So if you want a whispering output, you have to record yourself (or find a clip of someone) whispering. Then use that clip as the one-shot cloning sample.
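To make the mechanics concrete: in this one-shot setup the prompt text typically concatenates the transcript of the reference clip with the new text, so the model continues in the reference clip's style. A minimal sketch of that assembly step (the helper name and transcript strings are illustrative, not the actual Llasa API):

```python
# Sketch: assembling the text side of a one-shot cloning prompt, assuming
# the common "reference transcript + target text" convention. The function
# and example strings are illustrative, not an official API.

def build_cloning_prompt(ref_transcript: str, target_text: str) -> str:
    # The reference transcript anchors the speaking style (e.g. whispering);
    # the target text is the new line to be synthesized in that style.
    return ref_transcript.strip() + " " + target_text.strip()

prompt = build_cloning_prompt(
    "I'm whispering this sentence very quietly.",   # transcript of the whispered clip
    "Here is the new line I want spoken the same way.",
)
```

The reference audio itself is encoded separately (as speech tokens) and prepended to the generation; the text above only covers the transcript side.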

Though in some local experimenting, I can kind of inject a text prompt in the user turn, e.g.:

```python
# Tokenize the text
chat = [
    {"role": "system", "content": "You are a laughing british man." + formatted_text},  # this did not work
    {"role": "user", "content": "Convert the text to speech as if you were a laughing british man:" + formatted_text},  # this kind of works but is unreliable
    {"role": "assistant", "content": "<|SPEECH_GENERATION_START|>"},
]
```
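For context, a chat list like the one above gets flattened into a single prompt string (normally via the tokenizer's `apply_chat_template`) before generation. A runnable sketch that mimics, but does not reproduce, that flattening (the `<|role|>` markers here are illustrative, not the model's exact template):

```python
# Sketch: flattening a chat list into one prompt string, mimicking a
# Hugging Face chat template. The marker format is illustrative only;
# the real model defines its own template.

def apply_simple_template(chat):
    # Concatenate each turn as "<|role|>content", in order.
    return "".join(f"<|{turn['role']}|>{turn['content']}" for turn in chat)

formatted_text = "Hello there."  # the text to convert to speech
chat = [
    {"role": "user",
     "content": "Convert the text to speech as if you were a laughing british man:" + formatted_text},
    {"role": "assistant", "content": "<|SPEECH_GENERATION_START|>"},
]
prompt = apply_simple_template(chat)
```

The assistant turn is left open at `<|SPEECH_GENERATION_START|>` so the model continues from there with speech tokens.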

Here is where I read about it: https://huggingface.co/blog/srinivasbilla/llasa-tts#whisper

Yes, it shows emotions if the source audio carries the emotion. There may be a way to do it with control vectors, but I'm still looking into that.
