---
license: mit
datasets:
- reach-vb/jenny_tts_dataset
language:
- en
base_model:
- myshell-ai/MeloTTS-English
tags:
- audio
- melotts
---
# MeloTTS Model Checkpoint

This repository contains trained checkpoints for MeloTTS, a multilingual text-to-speech system. The checkpoints can be loaded for English text-to-speech synthesis.

## Model Details

- **Model Type**: MeloTTS
- **Language Support**: English (Default)
- **Sampling Rate**: 44.1 kHz
- **Mel Channels**: 128
- **Hidden Channels**: 192
- **Filter Channels**: 768

### Architecture Details
- Inter channels: 192
- Number of heads: 2
- Number of layers: 6
- Flow layers: 3
- Kernel size: 3
- Dropout rate: 0.1
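
The values above are normally stored in `config.json`. The following is a minimal sketch of reading them back from a parsed config. The key names (`data.sampling_rate`, `data.n_mel_channels`, `model.hidden_channels`, and so on) are assumptions based on the usual VITS-style layout and are not confirmed by this card, so check them against the actual file. The `example_config` dict is a hypothetical excerpt that mirrors the numbers listed above.

```python
def summarize_config(config: dict) -> dict:
    """Collect headline hyperparameters from a parsed config.json dict.

    Key names are assumed (VITS-style layout); adjust them to the real file.
    """
    data, model = config["data"], config["model"]
    return {
        "sampling_rate_hz": data["sampling_rate"],
        "mel_channels": data["n_mel_channels"],
        "inter_channels": model["inter_channels"],
        "hidden_channels": model["hidden_channels"],
        "filter_channels": model["filter_channels"],
        "heads": model["n_heads"],
        "layers": model["n_layers"],
        "kernel_size": model["kernel_size"],
        "dropout": model["p_dropout"],
    }


# Hypothetical excerpt using the values documented in this card
example_config = {
    "data": {"sampling_rate": 44100, "n_mel_channels": 128},
    "model": {
        "inter_channels": 192,
        "hidden_channels": 192,
        "filter_channels": 768,
        "n_heads": 2,
        "n_layers": 6,
        "kernel_size": 3,
        "p_dropout": 0.1,
    },
}

summary = summarize_config(example_config)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With the real file, replace `example_config` with `json.load(open("config.json"))` after downloading the repository.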

## Training Dataset

This model was trained on the [Jenny TTS Dataset](https://huggingface.co/datasets/reach-vb/jenny_tts_dataset), hosted on Hugging Face. The dataset consists of English speech recordings intended for text-to-speech training.

## Model Files

The repository contains the following files:
- `DUR_*.pth`: Duration predictor checkpoints
- `G_*.pth`: Generator model checkpoints
- `D_*.pth`: Discriminator model checkpoints
- `config.json`: Model configuration file
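
When several checkpoints of one kind are present, the newest one is usually what you want. The sketch below picks it by step number. It assumes the `*` in each pattern is an integer training step (for example `G_1000.pth`), which this card does not state. The helper `latest_checkpoint` and the file names in `files` are hypothetical.

```python
import re


def latest_checkpoint(filenames, prefix):
    """Return the file named <prefix>_<step>.pth with the largest step, or None."""
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)\.pth$")
    best = None  # (step, filename) of the newest checkpoint seen so far
    for name in filenames:
        match = pattern.match(name)
        if match:
            step = int(match.group(1))
            if best is None or step > best[0]:
                best = (step, name)
    return best[1] if best else None


# Hypothetical directory listing
files = ["G_900.pth", "G_1000.pth", "D_1000.pth", "DUR_1000.pth", "config.json"]
print(latest_checkpoint(files, "G"))  # G_1000.pth
```

Comparing the step as an integer matters here: a plain string sort would rank `G_900.pth` above `G_1000.pth`.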

## Usage

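To use this model with MeloTTS, download the repository and pass the local config and checkpoint paths to `TTS`:

```python
from pathlib import Path

from huggingface_hub import snapshot_download
from melo.api import TTS

# Download the repository; TTS() takes local config and checkpoint paths
repo_dir = Path(snapshot_download("kadirnar/melotts-model"))
# Pick a generator checkpoint (string sort; set the path explicitly if unsure)
ckpt_path = sorted(repo_dir.glob("G_*.pth"))[-1]

tts = TTS(
    language="EN",
    device="auto",
    config_path=str(repo_dir / "config.json"),
    ckpt_path=str(ckpt_path),
)

# Speaker IDs come from the spk2id mapping in the config; use the first one
speaker_id = next(iter(tts.hps.data.spk2id.values()))

# Generate speech
tts.tts_to_file("Your text here", speaker_id, "output.wav")
```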

## Training Details

The model was trained with the following hyperparameters:
- Batch size: 6
- Learning rate: 0.0003
- Beta values: [0.8, 0.99]
- Segment size: 16384
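
To put the segment size in context, the short calculation below converts it to seconds of audio. It assumes that the segment size is measured in audio samples, as is usual for VITS-style training, and that the hop length is 512. Neither detail is stated in this card, so treat the frame count in particular as illustrative.

```python
SAMPLING_RATE_HZ = 44_100  # from Model Details
SEGMENT_SIZE = 16_384      # from Training Details, assumed to be in samples
BATCH_SIZE = 6             # from Training Details
HOP_LENGTH = 512           # assumption: not stated in this card

segment_seconds = SEGMENT_SIZE / SAMPLING_RATE_HZ
frames_per_segment = SEGMENT_SIZE // HOP_LENGTH
batch_seconds = BATCH_SIZE * segment_seconds

print(f"Segment duration: {segment_seconds:.3f} s")     # 0.372 s
print(f"Mel frames per segment: {frames_per_segment}")  # 32
print(f"Audio per batch: {batch_seconds:.3f} s")        # 2.229 s
```

Under these assumptions, each training step sees about 2.2 seconds of waveform in total across the batch.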

## Original Repository

This model is based on [MeloTTS](https://github.com/myshell-ai/MeloTTS) by MyShell.ai. Visit the original repository for more details about the architecture and implementation.

## License

This model is released under the MIT License, the same license as the original MeloTTS repository.