Text-to-Speech
Fairseq
Spanish
audio

tts_transformer-es-css10

Transformer text-to-speech model from fairseq S^2 (paper/code):

  • Spanish
  • Single-speaker male voice
  • Trained on CSS10

Usage

Dependencies

pip install fairseq sentencepiece
from fairseq.checkpoint_utils import load_model_ensemble_and_task_from_hf_hub
from fairseq.models.text_to_speech.hub_interface import TTSHubInterface
import IPython.display as ipd


models, cfg, task = load_model_ensemble_and_task_from_hf_hub(
    "facebook/tts_transformer-es-css10",
    arg_overrides={"vocoder": "hifigan", "fp16": False}
)
model = models[0]
TTSHubInterface.update_cfg_with_data_cfg(cfg, task.data_cfg)
generator = task.build_generator([model], cfg)

text = "Hola, esta es una prueba."

sample = TTSHubInterface.get_model_input(task, text)
wav, rate = TTSHubInterface.get_prediction(task, model, generator, sample)

ipd.Audio(wav, rate=rate)

See also fairseq S^2 example.

Citation

@inproceedings{wang-etal-2021-fairseq,
    title = "fairseq S{\^{}}2: A Scalable and Integrable Speech Synthesis Toolkit",
    author = "Wang, Changhan  and
      Hsu, Wei-Ning  and
      Adi, Yossi  and
      Polyak, Adam  and
      Lee, Ann  and
      Chen, Peng-Jen  and
      Gu, Jiatao  and
      Pino, Juan",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-demo.17",
    doi = "10.18653/v1/2021.emnlp-demo.17",
    pages = "143--152",
}
Downloads last month
59
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Spaces using facebook/tts_transformer-es-css10 18

Collection including facebook/tts_transformer-es-css10