Emotions

#3
by jujutechnology - opened

Hi. I thought this model was supposed to be able to show emotions? I never get any, even when using the demo text from the model page. Is that only available if I download it locally and adjust some settings?

My impression is that you can "prompt" the emotions you want it to mimic by providing a one-shot voice-cloning sample.

So if you want a whispering output, you have to record yourself (or find a clip of someone) whispering. Then use that clip as the one-shot cloning sample.
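To make the mechanics concrete: in this one-shot setup the prompt text typically concatenates the transcript of the reference clip with the new text, so the model continues in the reference clip's style. A minimal sketch of that assembly step (the helper name and transcript strings are illustrative, not the actual Llasa API):

```python
# Sketch: assembling the text side of a one-shot cloning prompt, assuming
# the common "reference transcript + target text" convention. The function
# and example strings are illustrative, not an official API.

def build_cloning_prompt(ref_transcript: str, target_text: str) -> str:
    # The reference transcript anchors the speaking style (e.g. whispering);
    # the target text is the new line to be synthesized in that style.
    return ref_transcript.strip() + " " + target_text.strip()

prompt = build_cloning_prompt(
    "I'm whispering this sentence very quietly.",   # transcript of the whispered clip
    "Here is the new line I want spoken the same way.",
)
```

The reference audio itself is encoded separately (as speech tokens) and prepended to the generation; the text above only covers the transcript side.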

Though in some local experimenting, I can kind of inject a text prompt in the user turn, e.g.:

```python
# Tokenize the text
chat = [
    {"role": "system", "content": "You are a laughing british man." + formatted_text},  # this did not work
    {"role": "user", "content": "Convert the text to speech as if you were a laughing british man:" + formatted_text},  # this kind of works but is unreliable
    {"role": "assistant", "content": "<|SPEECH_GENERATION_START|>"},
]
```
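For context, a chat list like the one above gets flattened into a single prompt string (normally via the tokenizer's `apply_chat_template`) before generation. A runnable sketch that mimics, but does not reproduce, that flattening (the `<|role|>` markers here are illustrative, not the model's exact template):

```python
# Sketch: flattening a chat list into one prompt string, mimicking a
# Hugging Face chat template. The marker format is illustrative only;
# the real model defines its own template.

def apply_simple_template(chat):
    # Concatenate each turn as "<|role|>content", in order.
    return "".join(f"<|{turn['role']}|>{turn['content']}" for turn in chat)

formatted_text = "Hello there."  # the text to convert to speech
chat = [
    {"role": "user",
     "content": "Convert the text to speech as if you were a laughing british man:" + formatted_text},
    {"role": "assistant", "content": "<|SPEECH_GENERATION_START|>"},
]
prompt = apply_simple_template(chat)
```

The assistant turn is left open at `<|SPEECH_GENERATION_START|>` so the model continues from there with speech tokens.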

Here is where I read about it: https://huggingface.co/blog/srinivasbilla/llasa-tts#whisper

Yes, it shows emotions if the source audio carries the emotion. There may be a way to do it with control vectors, but I'm still looking into that.
