MUSTAR
/

Rigel-rvc-base-pretrained-model

	## Rigel Pretrained Model

	### Dataset

	* Size: Approximately 2000 hours of speech and vocals.
	* Languages:
	  * English: ~800 hours
	  * Spanish: ~200 hours
	  * French: ~42 hours
	  * Russian: ~188 hours
	  * Arabic: ~70 hours
	  * Japanese: ~140 hours
	  * Chinese (Mandarin): ~70 hours
	  * Korean: ~80 hours
	  * Hindi: ~30 hours
	  * Indonesian: ~53 hours
	  * Tagalog: ~30 hours
	  * Portuguese: ~40 hours
	  * German: ~35 hours
	  * Singing (all languages): ~190 hours
	  * Common language: Unknown amount
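
The itemized figures above can be checked against the headline total with a short tally. This is only a sketch: it assumes the singing hours are counted separately from the per-language speech hours, which the card does not state, and it leaves out the "Common language" entry because its amount is unknown. The variable names are illustrative.

```python
# Approximate hours per language, copied from the Dataset list above.
speech_hours = {
    "English": 800,
    "Spanish": 200,
    "French": 42,
    "Russian": 188,
    "Arabic": 70,
    "Japanese": 140,
    "Chinese (Mandarin)": 70,
    "Korean": 80,
    "Hindi": 30,
    "Indonesian": 53,
    "Tagalog": 30,
    "Portuguese": 40,
    "German": 35,
}

# Assumption: singing is a separate bucket, not already included above.
singing_hours = 190

speech_total = sum(speech_hours.values())
itemized_total = speech_total + singing_hours

print(f"Speech (itemized): {speech_total} h")
print(f"Speech + singing:  {itemized_total} h")

# Share of each language within the itemized speech hours, largest first.
for language, hours in sorted(speech_hours.items(), key=lambda kv: -kv[1]):
    print(f"{language:20s} {hours:4d} h  {hours / speech_total:6.1%}")
```

Under these assumptions the itemized entries add up to 1,968 hours, which is consistent with the stated size of approximately 2000 hours once the unquantified "Common language" data and rounding are taken into account.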

	### Sampling Frequency

	* 32kHz (Done)
	* 40kHz (Retraining)

	### Models

	#### Base Model

	* Data: Approximately 2000 hours of low- to mid-quality data.
	* Steps: 3,890,220
	* Batch: 40-20-2
	* Precision: FP32
	* Sampling Frequency: 32kHz

	#### Fine-Tuned Model

	* Data: 102 hours of high-quality data.
	* Steps: 2,854,856
	* Batch: 20-12-2
	* Precision: FP32
	* Sampling Frequency: 32kHz

	### Hardware Used

	* CPU: AMD EPYC 9754
	* RAM: 256GB
	* GPUs:
	  * 1 x H100
	  * 4 x L40S
	  * 1 x RTX 4080
	  * 1 x RTX 4070 Ti

	### Expected Release Date

	* July 22nd

