MerlinLi's Collections
text-to-speech
updated
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Paper • 2404.14700 • Published • 29
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
Paper • 2306.15687 • Published
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
Paper • 2403.03100 • Published • 34
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
Paper • 2404.09956 • Published • 11
Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts
Paper • 2307.07218 • Published • 26
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias
Paper • 2306.03509 • Published • 4
parler-tts/dac_44khZ_8kbps
Updated • 408 • 17
parler-tts/parler_tts_mini_v0.1
Text-to-Speech • Updated • 19k • 347
Wenetspeech4TTS/WenetSpeech4TTS
Updated • 527 • 69
liuhuadai/AudioLCM
Text-to-Audio • Updated • 4 • 7
kyutai/mimi
Feature Extraction • Updated • 6.03M • 99