
question on language coverage

#4
by rjrobben - opened

I am wondering why TTS covers less widely spoken languages (such as Hakka Chinese) but not much more widely spoken ones (such as Mandarin or Cantonese).

This seems counterintuitive to me, since training data should be much harder to obtain for less widely spoken languages.

Thanks!

Is it related to how the language is tokenized? I can see that for hak, the vocab.txt is very minimal.

Hi @rjrobben

Could you elaborate a bit on what you mean by coverage for less popular languages (like Hakka) but not for much more popular ones (like Mandarin or Cantonese)?
I could not find anything about this in the model card or in the paper.

Hi @ydshieh , thanks for the reply.

If you look at https://dl.fbaipublicfiles.com/mms/misc/language_coverage_mms.html

And search for "hak", you can see that there is TTS support for Hakka Chinese.

But if you search for "mandarin" or "yue", you can see that they have no TTS support.

If you check the list of most spoken languages, you can see that yue and Mandarin have far more speakers than hak:
https://en.m.wikipedia.org/wiki/List_of_languages_by_number_of_native_speakers

Thanks a lot! Indeed!

@vineelpratap Do you know why? I see you are the author of many commits in this repository, so I think you would know best.
