A vanilla VALL-E model trained on WenetSpeech4TTS using the Amphion toolkit.
Training follows Amphion's VALL-E training code, except that the text-to-phoneme feature step is slightly different.
Checkpoints
- base_model.bin: VALL-E trained on the WenetSpeech4TTS Basic subset
- 38sft_model.bin: the Basic model fine-tuned on the WenetSpeech4TTS Standard subset
- 4sft_model.bin: the Standard model fine-tuned on the WenetSpeech4TTS Premium subset
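
The three checkpoints form a staged fine-tuning chain, which can be sketched as a small lookup. This is only an illustration: the names `CHECKPOINTS` and `training_lineage` are hypothetical, and the parent links assume that the "Standard" model fine-tuned on Premium is `38sft_model.bin`, as the list suggests.

```python
from typing import List

# Hypothetical mapping that mirrors the checkpoint list above.
# "parent" is the checkpoint each model is assumed to be fine-tuned from.
CHECKPOINTS = {
    "base_model.bin": {"subset": "Basic", "parent": None},
    "38sft_model.bin": {"subset": "Standard", "parent": "base_model.bin"},
    "4sft_model.bin": {"subset": "Premium", "parent": "38sft_model.bin"},
}


def training_lineage(filename: str) -> List[str]:
    """Return the WenetSpeech4TTS subsets a checkpoint was trained on, in order."""
    subsets = []
    current = filename
    # Walk back through the parents until the base model is reached.
    while current is not None:
        entry = CHECKPOINTS[current]
        subsets.append(entry["subset"])
        current = entry["parent"]
    return list(reversed(subsets))


if __name__ == "__main__":
    for name in CHECKPOINTS:
        print(name, "->", " -> ".join(training_lineage(name)))
```

Under these assumptions, `4sft_model.bin` has seen all three subsets, so it is likely the one to try first when output quality matters most.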
Usage
Inference code and more details: ISCSLP2024_CoVoC_baseline.