LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus
Abstract
This paper introduces a new speech dataset called "LibriTTS-R", designed for text-to-speech (TTS) use. It is derived by applying speech restoration to the LibriTTS corpus, which consists of 585 hours of speech data at a 24 kHz sampling rate from 2,456 speakers and the corresponding texts. The constituent samples of LibriTTS-R are identical to those of LibriTTS; only the sound quality is improved. Experimental results show that the ground-truth samples of LibriTTS-R have significantly better sound quality than those of LibriTTS, and that a neural end-to-end TTS model trained on LibriTTS-R achieves speech naturalness on par with the ground-truth samples. The corpus is freely available for download from http://www.openslr.org/141/.
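Because LibriTTS-R preserves the LibriTTS directory layout (subset/speaker/chapter, with a `.wav` file and a matching `.normalized.txt` transcript per utterance), a subset can be fetched from OpenSLR and paired with its transcripts in a few lines of Python. The snippet below is a minimal sketch only: the archive name `dev_clean.tar.gz` and the extracted root directory `LibriTTS_R` are assumptions and should be checked against the actual file listing at http://www.openslr.org/141/.

```python
import tarfile
import urllib.request
from pathlib import Path

# Assumed archive name and URL; confirm the exact file names on the
# OpenSLR 141 download page before running.
OPENSLR_URL = "https://www.openslr.org/resources/141/dev_clean.tar.gz"
ARCHIVE = Path("dev_clean.tar.gz")
DATA_ROOT = Path("LibriTTS_R")  # assumed name of the extracted root directory


def download_and_extract():
    """Fetch one LibriTTS-R subset archive and unpack it into the working directory."""
    if not ARCHIVE.exists():
        urllib.request.urlretrieve(OPENSLR_URL, ARCHIVE)
    with tarfile.open(ARCHIVE, "r:gz") as tar:
        tar.extractall(".")


def iter_utterances(root: Path):
    """Yield (wav_path, transcript) pairs.

    Follows the LibriTTS convention of a <utterance>.normalized.txt
    transcript stored next to each <utterance>.wav file.
    """
    for wav in sorted(root.rglob("*.wav")):
        txt = wav.parent / (wav.stem + ".normalized.txt")
        if txt.exists():
            yield wav, txt.read_text(encoding="utf-8").strip()


if __name__ == "__main__":
    download_and_extract()
    for wav_path, text in iter_utterances(DATA_ROOT):
        print(wav_path.name, "->", text[:60])
        break  # print only the first pair as a sanity check
```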
Community
This is an automated message from the Librarian Bot. The following papers, similar to this one, were recommended by the Semantic Scholar API:
- FLEURS-R: A Restored Multilingual Speech Corpus for Generation Tasks (2024)
- IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS (2024)
- Training Universal Vocoders with Feature Smoothing-Based Augmentation Methods for High-Quality TTS Systems (2024)
- Text-To-Speech Synthesis In The Wild (2024)
- Enhancing Polyglot Voices by Leveraging Cross-Lingual Fine-Tuning in Any-to-One Voice Conversion (2024)