---
license: apache-2.0
language:
- en
tags:
- ai
- rvc
- vc
- voice-cloning
- applio
- titan
- pretrained
datasets:
- blaise-tk/TITAN-Medium
pipeline_tag: audio-to-audio
---
# TITAN: A Versatile, Robust, and High-Quality Pretrained Model for Retrieval-based Voice Conversion (RVC) Training
## Overview
TITAN is a pretrained model for Retrieval-based Voice Conversion (https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/) training. It provides a robust starting point for training models that convert one speaker's voice characteristics to another's, giving high-quality results with minimal training effort.
## Model Details
### Titan-Medium
- Training environment: Trained on an RTX 3060 Ti with Applio v3.1.1 (https://github.com/IAHispano/Applio), using a batch size of 8, over 3 weeks.
- Iterations: 1,010,588 steps (467 epochs)
- Sampling rates: 40 kHz, 32 kHz (still training)
- Fine-tuning process: RVC v2 pretrained with pitch guidance, fine-tuned on an 11.15-hour dataset sourced from Expresso (https://arxiv.org/abs/2308.05725), which is also available at [datasets/blaise-tk/TITAN-Medium](https://huggingface.co/datasets/blaise-tk/TITAN-Medium).
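
As a rough sanity check, the figures above can be related to one another with a little arithmetic. The sketch below assumes that every training step consumes one full batch of 8 segments and that each epoch passes over the whole 11.15-hour dataset once; neither assumption is stated in this card, so treat the derived numbers as estimates only.

```python
# Figures quoted in the model details above.
TOTAL_STEPS = 1_010_588
EPOCHS = 467
BATCH_SIZE = 8
DATASET_HOURS = 11.15

# Steps needed to complete one pass over the dataset.
steps_per_epoch = TOTAL_STEPS / EPOCHS

# Assumption: one step == one full batch of BATCH_SIZE audio segments.
segments_per_epoch = steps_per_epoch * BATCH_SIZE

# Assumption: one epoch covers the full dataset exactly once, so this is
# the implied average length of a training segment in seconds.
avg_segment_seconds = DATASET_HOURS * 3600 / segments_per_epoch

print(f"steps per epoch:     {steps_per_epoch:.0f}")
print(f"segments per epoch:  {segments_per_epoch:.0f}")
print(f"avg segment length:  {avg_segment_seconds:.2f} s")
```

Under these assumptions the run works out to 2,164 steps per epoch, about 17,312 segments per epoch, and an average segment length of roughly 2.3 seconds.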
#### Samples
*Tests were performed with an early checkpoint at ~700k steps, with all tests run under the same conditions.*
<table style="width:100%; text-align:center;">
<tr>
<th>Titan-Medium</th>
<th>Ov2</th>
<th>Ov2.1</th>
</tr>
<tr>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 1 - Titan.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 1 - Ov2.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
</tr>
<tr>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 2 - Titan.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 2 - Ov2.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
</tr>
<tr>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 1 - Titan.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 1 - Ov2.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
</tr>
<tr>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 2 - Titan.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 2 - Ov2.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
</tr>
<tr>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Titan.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Ov2.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Ov2.1.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
</tr>
<tr>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Titan.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Ov2.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
<td>
<audio controls>
<source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Ov2.1.wav?download=true" type="audio/wav">
Your browser does not support the audio element.
</audio>
</td>
</tr>
</table>
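
The sample files in the table follow a regular naming pattern, so their download URLs can be generated rather than copied by hand. The following is a minimal sketch based only on the URLs that appear in the table; the helper names `demo_url` and `all_demo_urls` are hypothetical, and the `SAMPLES` mapping simply mirrors which columns are filled in for each test model.

```python
from urllib.parse import quote

BASE_URL = "https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/"

# Pretrained models with a sample for each test voice model, as listed
# in the table above (only model 3 has Ov2.1 samples).
SAMPLES = {
    1: ["Titan", "Ov2"],
    2: ["Titan", "Ov2"],
    3: ["Titan", "Ov2", "Ov2.1"],
}
TESTS = (1, 2)


def demo_url(model: int, test: int, pretrained: str) -> str:
    """Build the percent-encoded download URL for one demo file."""
    filename = f"Model {model} - Test {test} - {pretrained}.wav"
    # quote() encodes the spaces in the file name as %20.
    return BASE_URL + quote(filename) + "?download=true"


def all_demo_urls() -> list[str]:
    """Return the URLs of every sample shown in the table."""
    return [
        demo_url(model, test, name)
        for model, names in SAMPLES.items()
        for test in TESTS
        for name in names
    ]


if __name__ == "__main__":
    for url in all_demo_urls():
        print(url)
```

Percent-encoding the file names keeps the URLs usable in tools that do not accept raw spaces, such as command-line downloaders.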
### Titan-Large
- Details forthcoming...
## Collaborators
We appreciate the contributions of our collaborators who have helped in the development and refinement of TITAN.
- Mustar
- SimplCup
- UnitedShoes
## Beta Testers
We extend our gratitude to the beta testers who provided valuable feedback during the testing phase of TITAN.
- SimplCup
- Leo_Frixi
- Light
- SCRFilms
- Ryanz
- Litsa_the_dancer
## Citation
If you find TITAN useful for your research or projects, please cite this repository:
```
@article{titan,
title={TITAN: A Versatile, Robust, and High-Quality Pretrained Model for Retrieval-based Voice Conversion (RVC) Training},
author={Blaise},
journal={Hugging Face},
year={2024},
publisher={Blaise},
url={https://huggingface.co/blaise-tk/TITAN/}
}
```