About

This is a basic zero-shot voice conversion model trained with VITS and ContentVec.

See:

https://github.com/alphacep/vosk-tts/tree/master/vc

https://github.com/quickvc/QuickVC-VoiceConversion

https://github.com/auspicious3000/contentvec

Speaker Similarity

Scores were computed by eval.py using Resemblyzer; higher means the converted voice is closer to the target speaker.

Model                                    Average   Min
Original QuickVC (trained on VCTK)       0.667     0.477
New model                                0.880     0.712
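
As a rough illustration of how an Average/Min summary like the one above could be produced, here is a minimal sketch. It assumes the score for each utterance is the cosine similarity between a target-speaker embedding and a converted-speech embedding; the actual eval.py may work differently. The function names and the toy 2-dimensional vectors are hypothetical. In practice the embeddings would come from a speaker encoder such as Resemblyzer rather than being typed in by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def summarize(pairs):
    """Return (average, minimum) similarity over (target, converted) pairs."""
    scores = [cosine_similarity(target, converted) for target, converted in pairs]
    return sum(scores) / len(scores), min(scores)


# Toy embeddings for illustration only (assumption: real ones are
# high-dimensional vectors produced by a speaker encoder).
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # same direction
    ([1.0, 0.0], [1.0, 1.0]),  # 45 degrees apart
]

average, minimum = summarize(pairs)
print(f"Average: {average:.3f} Min: {minimum:.3f}")
```

Reporting the minimum alongside the average shows the worst-case conversion, which an average alone can hide.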