---
license: mit
datasets:
- reach-vb/jenny_tts_dataset
language:
- en
base_model:
- myshell-ai/MeloTTS-English
tags:
- audio
- melotts
---
# MeloTTS Model Checkpoint
This repository contains trained checkpoints for MeloTTS, a high-quality multilingual text-to-speech system. The checkpoints were fine-tuned from the English base model and can be used for English speech synthesis.
## Model Details
- **Model Type**: MeloTTS
- **Language Support**: English (Default)
- **Sampling Rate**: 44.1 kHz
- **Mel Channels**: 128
- **Hidden Channels**: 192
- **Filter Channels**: 768
### Architecture Details
- Inter channels: 192
- Number of heads: 2
- Number of layers: 6
- Flow layers: 3
- Kernel size: 3
- Dropout rate: 0.1
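The values above should correspond to entries in `config.json`. As a minimal sketch, the helper below compares a loaded config against the numbers listed in this README. The key paths (for example `data.sampling_rate` and `model.n_layers_trans_flow`) are assumptions modeled on VITS-style configs and may be named differently in this repository's file.

```python
import json

# Values listed in this README. The key paths are assumptions modeled on
# VITS-style configs; adjust them if config.json uses different names.
EXPECTED = {
    ("data", "sampling_rate"): 44100,
    ("data", "n_mel_channels"): 128,
    ("model", "hidden_channels"): 192,
    ("model", "filter_channels"): 768,
    ("model", "inter_channels"): 192,
    ("model", "n_heads"): 2,
    ("model", "n_layers"): 6,
    ("model", "n_layers_trans_flow"): 3,  # assumed key for "Flow layers"
    ("model", "kernel_size"): 3,
    ("model", "p_dropout"): 0.1,
}


def find_mismatches(config: dict) -> dict:
    """Return {"section.key": (expected, actual)} for each differing or missing entry."""
    mismatches = {}
    for (section, key), expected in EXPECTED.items():
        actual = config.get(section, {}).get(key)
        if actual != expected:
            mismatches[f"{section}.{key}"] = (expected, actual)
    return mismatches


def check_config_file(path: str) -> dict:
    """Load a JSON config from disk and report mismatches against EXPECTED."""
    with open(path, encoding="utf-8") as f:
        return find_mismatches(json.load(f))
```

Calling `check_config_file("config.json")` returns an empty dict when every listed value matches. A missing key is reported with `None` as the actual value.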
## Training Dataset
This model was trained on the [Jenny TTS Dataset](https://huggingface.co/datasets/reach-vb/jenny_tts_dataset), which is available on Hugging Face. The dataset consists of high-quality English speech recordings suitable for text-to-speech training.
## Model Files
The repository contains the following files:
- `DUR_*.pth`: Duration predictor checkpoints
- `G_*.pth`: Generator model checkpoints
- `D_*.pth`: Discriminator model checkpoints
- `config.json`: Model configuration file
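When several checkpoints of one kind are present, you usually want the most recent one. The sketch below picks the file with the highest number in its name. It assumes the `*` in the patterns above is a numeric training step (for example `G_12000.pth`), which this README does not state explicitly. The function name `latest_checkpoint` is illustrative.

```python
import re
from pathlib import Path
from typing import Optional


def latest_checkpoint(directory: str, prefix: str = "G") -> Optional[Path]:
    """Return the `<prefix>_<step>.pth` file with the highest numeric step, or None."""
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)\.pth$")
    best_step, best_path = -1, None
    for path in Path(directory).glob(f"{prefix}_*.pth"):
        match = pattern.match(path.name)
        # Skip files whose suffix is not purely numeric.
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

Steps are compared as integers, not strings, so `G_20000.pth` ranks above `G_3000.pth`. Pass `prefix="D"` or `prefix="DUR"` to select the other checkpoint types.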
## Usage
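To use this model with MeloTTS, download `config.json` and a generator checkpoint (`G_*.pth`) from this repository and pass their local paths to `TTS`:
```python
from melo.api import TTS

# Initialize TTS from local files downloaded from this repository.
# "G_<step>.pth" is a placeholder: substitute the generator checkpoint you downloaded.
tts = TTS(
    language="EN",
    device="auto",  # picks a GPU if available, otherwise CPU
    config_path="config.json",
    ckpt_path="G_<step>.pth",
)

# Speaker names are defined in the config; inspect this mapping to see what is available.
speaker_ids = tts.hps.data.spk2id

# Generate speech
tts.tts_to_file(
    "Your text here",
    speaker_ids["EN-default"],
    "output.wav",
)
```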
## Training Details
The model was trained with the following specifications:
- Batch size: 6
- Learning rate: 0.0003
- Beta values: [0.8, 0.99]
- Segment size: 16384
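To put the segment size in perspective, the short calculation below converts it to a duration at the model's 44.1 kHz sampling rate. It assumes the segment size counts raw audio samples, as is common in VITS-style training. This README does not say so explicitly.

```python
SAMPLING_RATE_HZ = 44_100  # from "Model Details"
SEGMENT_SIZE = 16_384      # assumed to be a number of audio samples
BATCH_SIZE = 6


def segment_seconds(segment_size: int, sampling_rate: int) -> float:
    """Duration in seconds of one training segment."""
    return segment_size / sampling_rate


segment_s = segment_seconds(SEGMENT_SIZE, SAMPLING_RATE_HZ)
batch_s = BATCH_SIZE * segment_s
print(f"{segment_s * 1000:.1f} ms per segment, {batch_s:.2f} s of audio per batch")
# prints: 371.5 ms per segment, 2.23 s of audio per batch
```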
## Original Repository
This model is based on [MeloTTS](https://github.com/myshell-ai/MeloTTS) by MyShell.ai. Visit the original repository for more details about the architecture and implementation.
## License
This model is released under the MIT License, the same license as the original MeloTTS repository.