Text-to-Speech
speechbrain
English
TTS
speech-synthesis
Tacotron2
Mirco committed on
Commit
f7400ef
1 Parent(s): 56bef66

Update README.md

Files changed (1)
  1. README.md +72 -1
README.md CHANGED
@@ -19,8 +19,79 @@ metrics:
19
  <iframe src="https://ghbtns.com/github-btn.html?user=speechbrain&repo=speechbrain&type=star&count=true&size=large&v=2" frameborder="0" scrolling="0" width="170" height="30" title="GitHub"></iframe>
20
  <br/><br/>
21
 
22
- # Work-in-Progress
23
 
24
 
25
  ### Limitations
26
  The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.
 
19
  <iframe src="https://ghbtns.com/github-btn.html?user=speechbrain&repo=speechbrain&type=star&count=true&size=large&v=2" frameborder="0" scrolling="0" width="170" height="30" title="GitHub"></iframe>
20
  <br/><br/>
21
 
 
22
 
23
+ # Text-to-Speech (TTS) with Tacotron2 trained on LJSpeech
24
+
25
+ This repository provides all the necessary tools for Text-to-Speech (TTS) with SpeechBrain using a [Tacotron2](https://arxiv.org/abs/1712.05884) pretrained on [LJSpeech](https://keithito.com/LJ-Speech-Dataset/).
26
+
27
+ The pre-trained model takes a short text as input and produces a mel spectrogram as output. The final waveform can be obtained by applying a vocoder (e.g., HiFi-GAN) to the generated spectrogram.
28
+
29
+
30
+ ## Install SpeechBrain
31
+
32
+ Currently, SpeechBrain must be installed from source:
33
+
34
+ 1. Clone SpeechBrain:
35
+
36
+ ```bash
37
+ git clone https://github.com/speechbrain/speechbrain/
38
+ ```
39
+
40
+ 2. Install it:
41
+
42
+ ```bash
43
+ cd speechbrain
44
+ pip install -r requirements.txt
45
+ pip install -e .
46
+ ```
47
+
48
+ We encourage you to read our tutorials and learn more about
49
+ [SpeechBrain](https://speechbrain.github.io).
50
+
51
+ ### Perform Text-to-Speech (TTS)
52
+
53
+ ```python
54
+ from speechbrain.pretrained import Tacotron2
55
+ tacotron2 = Tacotron2.from_hparams(source="speechbrain/TTS_Tacotron2", savedir="tmpdir")
56
+ mel_output, mel_length, alignment = tacotron2.encode_text("Mary had a little lamb")
57
+ ```
58
+
59
+ To synthesize several sentences in a single batch, pass a list of strings to `encode_batch`:
60
+
61
+ ```python
62
+ from speechbrain.pretrained import Tacotron2
63
+ tacotron2 = Tacotron2.from_hparams(source="speechbrain/TTS_Tacotron2", savedir="tmpdir")
64
+ items = [
65
+ "A quick brown fox jumped over the lazy dog",
66
+ "How much wood would a woodchuck chuck?",
67
+ "Never odd or even"
68
+ ]
69
+ mel_outputs, mel_lengths, alignments = tacotron2.encode_batch(items)
70
+
71
+ ```
72
+
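The description above says a vocoder turns the generated spectrogram into a waveform, but the README does not show that step. The following is a minimal sketch of one way to do it, not the authors' documented procedure. It assumes that `speechbrain.pretrained` exposes a `HIFIGAN` interface with a `decode_batch` method and that a vocoder checkpoint is published as `speechbrain/tts-hifigan-ljspeech`; neither name appears in this README. The helper name `synthesize_to_wav` and the output path are hypothetical.

```python
def synthesize_to_wav(text: str, wav_path: str = "example_TTS.wav") -> str:
    """Run Tacotron2, then a HiFi-GAN vocoder, and write the result to wav_path."""
    # Imports live inside the function so that defining it does not
    # require SpeechBrain or torchaudio to be installed.
    import torchaudio
    from speechbrain.pretrained import HIFIGAN, Tacotron2

    # Same model source as in the README examples.
    tacotron2 = Tacotron2.from_hparams(
        source="speechbrain/TTS_Tacotron2", savedir="tmpdir"
    )
    # ASSUMPTION: vocoder interface and model id are not given in this README.
    hifi_gan = HIFIGAN.from_hparams(
        source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder"
    )

    # Text -> mel spectrogram.
    mel_output, mel_length, alignment = tacotron2.encode_text(text)
    # Mel spectrogram -> waveform (assumed shape: [batch, 1, time]).
    waveforms = hifi_gan.decode_batch(mel_output)
    # LJSpeech audio is sampled at 22050 Hz.
    torchaudio.save(wav_path, waveforms.squeeze(1), 22050)
    return wav_path
```

Calling `synthesize_to_wav("Mary had a little lamb")` would download both models on first use, so it needs network access and a working SpeechBrain installation.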
73
+ ### Inference on GPU
74
+ To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
75
+
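As an illustration of the `run_opts` argument, here is a small sketch that selects the device at run time and falls back to the CPU when CUDA is not available. The helper name `build_run_opts` is hypothetical and not part of SpeechBrain; only the `{"device": ...}` dictionary format comes from the text above.

```python
def build_run_opts(prefer_gpu: bool = True) -> dict:
    """Return a run_opts dict for from_hparams, using CUDA only if it is usable."""
    device = "cpu"
    if prefer_gpu:
        try:
            import torch  # optional here: without it we simply stay on the CPU

            if torch.cuda.is_available():
                device = "cuda"
        except ImportError:
            pass
    return {"device": device}


# Intended use (requires SpeechBrain and downloads the model):
# from speechbrain.pretrained import Tacotron2
# tacotron2 = Tacotron2.from_hparams(
#     source="speechbrain/TTS_Tacotron2",
#     savedir="tmpdir",
#     run_opts=build_run_opts(),
# )
```

This keeps the same script usable on machines with and without a GPU.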
76
+ ### Training
77
+ The model was trained with SpeechBrain.
78
+ To train it from scratch, follow these steps:
79
+ 1. Clone SpeechBrain:
80
+ ```bash
81
+ git clone https://github.com/speechbrain/speechbrain/
82
+ ```
83
+ 2. Install it:
84
+ ```bash
85
+ cd speechbrain
86
+ pip install -r requirements.txt
87
+ pip install -e .
88
+ ```
89
+ 3. Run Training:
90
+ ```bash
91
+ cd recipes/LJSpeech/TTS/tacotron2
92
+ python train.py --device=cuda:0 --max_grad_norm=1.0 --data_folder=/your_folder/LJSpeech-1.1 hparams/train.yaml
93
+ ```
94
+ You can find our training results (models, logs, etc.) [here](https://drive.google.com/drive/folders/1PKju-_Nal3DQqd-n0PsaHK-bVIOlbf26?usp=sharing).
95
 
96
  ### Limitations
97
  The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.