---
license: mit
datasets:
- reach-vb/jenny_tts_dataset
language:
- en
base_model:
- myshell-ai/MeloTTS-English
tags:
- audio
- melotts
---

# MeloTTS Model Checkpoint

This repository contains trained model checkpoints for MeloTTS, a high-quality multilingual text-to-speech system. The checkpoints can be loaded with the MeloTTS library for English speech synthesis.

## Model Details

- **Model Type**: MeloTTS
- **Language Support**: English (Default)
- **Sampling Rate**: 44.1 kHz
- **Mel Channels**: 128
- **Hidden Channels**: 192
- **Filter Channels**: 768

### Architecture Details

- Inter channels: 192
- Number of heads: 2
- Number of layers: 6
- Flow layers: 3
- Kernel size: 3
- Dropout rate: 0.1
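
The values listed above can be checked against a loaded configuration. The following is a minimal sketch only: the key names (`sampling_rate`, `n_mel_channels`, `n_flow_layers`, and so on) are hypothetical, modeled on a VITS-style layout, and may not match the keys in this repository's `config.json`. The helper `find_mismatches` is likewise introduced here for illustration.

```python
# Values taken from the model card. The key names are ASSUMPTIONS modeled on a
# VITS-style config; check config.json for the names it really uses.
model_card_values = {
    "data": {"sampling_rate": 44100, "n_mel_channels": 128},
    "model": {
        "hidden_channels": 192,
        "inter_channels": 192,
        "filter_channels": 768,
        "n_heads": 2,
        "n_layers": 6,
        "n_flow_layers": 3,
        "kernel_size": 3,
        "p_dropout": 0.1,
    },
}


def find_mismatches(expected, actual, prefix=""):
    """Return dotted paths where `actual` is missing or differs from `expected`."""
    mismatches = []
    for key, value in expected.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            sub = actual.get(key)
            # Treat a missing or non-dict section as empty so every key is reported.
            sub = sub if isinstance(sub, dict) else {}
            mismatches += find_mismatches(value, sub, path + ".")
        elif actual.get(key) != value:
            mismatches.append(path)
    return mismatches
```

To compare against the real file, parse `config.json` with `json.load` and pass the resulting dictionary as `actual`. An empty list means every listed value was found under the assumed key names.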

## Training Dataset

This model was trained on the [Jenny TTS Dataset](https://huggingface.co/datasets/reach-vb/jenny_tts_dataset), which is available on Hugging Face. The dataset consists of high-quality English speech recordings suitable for text-to-speech training.

## Model Files

The repository contains several checkpoint files:

- `DUR_*.pth`: Duration predictor checkpoints
- `G_*.pth`: Generator model checkpoints
- `D_*.pth`: Discriminator model checkpoints
- `config.json`: Model configuration file
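
When several checkpoints of the same kind are present, the newest one is usually the one to load. The sketch below assumes that the `*` in each file name is a numeric training step, as is common in VITS-style training scripts; this is not confirmed for this repository. The helper `latest_checkpoint` is a hypothetical name used only for illustration.

```python
import re
from pathlib import Path


def latest_checkpoint(directory, prefix="G"):
    """Return the path of the `<prefix>_<step>.pth` file with the highest step, or None."""
    # ASSUMPTION: the wildcard part of the file name is an integer step count.
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)\.pth$")
    best_step, best_path = -1, None
    for path in Path(directory).glob(f"{prefix}_*.pth"):
        match = pattern.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

For example, `latest_checkpoint(".", prefix="G")` picks the most recent generator checkpoint in the current directory, and `prefix="DUR"` or `prefix="D"` selects the other two families. Step numbers are compared as integers, so `G_20000.pth` ranks above `G_9000.pth`.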
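
## Usage

To use this model with MeloTTS, download `config.json` and a generator checkpoint from this repository, then pass their local paths to `TTS`:

```python
from melo.api import TTS

# Local paths to files downloaded from this repository.
# Replace "G_*.pth" with the actual generator checkpoint file name.
config_path = "config.json"
ckpt_path = "G_*.pth"

# Initialize TTS with the custom config and checkpoint
tts = TTS(language="EN", device="auto", config_path=config_path, ckpt_path=ckpt_path)

# Speaker names are defined in config.json; "EN-default" is assumed here
speaker_id = tts.hps.data.spk2id["EN-default"]

# Generate speech
tts.tts_to_file("Your text here", speaker_id, "output.wav")
```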

## Training Details

The model was trained with the following specifications:

- Batch size: 6
- Learning rate: 0.0003
- Beta values: [0.8, 0.99]
- Segment size: 16384
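
To put the segment size in perspective, the short calculation below converts it to seconds of audio. It assumes that the segment size is counted in waveform samples at the 44.1 kHz sampling rate listed under Model Details, and that the batch size is per training step on a single device. Neither assumption is confirmed by this model card.

```python
SAMPLING_RATE_HZ = 44_100  # from Model Details
SEGMENT_SIZE = 16_384      # ASSUMPTION: measured in waveform samples
BATCH_SIZE = 6             # ASSUMPTION: per step, single device

# Duration of one training segment and of all segments in one batch
segment_seconds = SEGMENT_SIZE / SAMPLING_RATE_HZ
audio_per_step_seconds = BATCH_SIZE * segment_seconds

print(f"{segment_seconds:.4f} s per segment")               # 0.3715 s per segment
print(f"{audio_per_step_seconds:.3f} s of audio per step")  # 2.229 s of audio per step
```

Under these assumptions each segment covers a little over a third of a second of audio, which is short compared with a typical utterance.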

## Original Repository

This model is based on [MeloTTS](https://github.com/myshell-ai/MeloTTS) by MyShell.ai. Visit the original repository for more details about the architecture and implementation.

## License

This model follows the same licensing as the original MeloTTS repository (MIT License).