German Voice Cloning TTS Model using F5-TTS Architecture

A German Text-to-Speech system capable of cloning voices from a few seconds of reference audio, built on the F5-TTS architecture.

Model Details

Key Features & Capabilities

  • Generates natural-sounding German speech from text
  • Clones voices from only a few seconds of reference audio
  • Suitable for audiobooks, voice assistants, and accessibility applications
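
As a rough sketch of how voice cloning might be invoked, the snippet below assembles a command line for the `f5-tts_infer-cli` tool from the upstream F5-TTS project. It rests on several assumptions: the flag names follow the upstream CLI and may differ between F5-TTS versions, the model config name `F5TTS_Base` is a guess at the matching upstream config, and every file path is a hypothetical placeholder. The helper `build_infer_command` is illustrative and not part of any library.

```python
import shlex

ALLOWED_VOCODERS = ("vocos", "bigvgan")


def build_infer_command(model, ckpt_file, vocab_file, ref_audio, ref_text,
                        gen_text, vocoder_name="vocos"):
    """Return an argv list for the upstream F5-TTS inference CLI.

    Flag names are assumed from the upstream project and may change
    between releases; check `f5-tts_infer-cli --help` before relying on them.
    """
    if vocoder_name not in ALLOWED_VOCODERS:
        raise ValueError(f"vocoder_name must be one of {ALLOWED_VOCODERS}")
    return [
        "f5-tts_infer-cli",
        "--model", model,              # model config name (assumption)
        "--ckpt_file", ckpt_file,      # fine-tuned German checkpoint
        "--vocab_file", vocab_file,    # vocabulary shipped with the checkpoint
        "--ref_audio", ref_audio,      # a few seconds of the voice to clone
        "--ref_text", ref_text,        # transcript of the reference audio
        "--gen_text", gen_text,        # German text to synthesize
        "--vocoder_name", vocoder_name,
    ]


# All paths below are hypothetical placeholders, not real file names.
cmd = build_infer_command(
    model="F5TTS_Base",
    ckpt_file="path/to/checkpoint.safetensors",
    vocab_file="path/to/vocab.txt",
    ref_audio="path/to/reference.wav",
    ref_text="Das ist eine kurze Referenzaufnahme.",
    gen_text="Guten Tag, dies ist ein Test der deutschen Sprachsynthese.",
)
print(shlex.join(cmd))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` once the placeholders point at real files.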

Technical Specifications

Checkpoints are available in two directories: F5TTS_Base (for the Vocos vocoder) and F5TTS_Base_bigvgan (for the BigVGAN vocoder). Download the one that matches the vocoder you intend to use.
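
One possible way to fetch only the directory for a given vocoder is `snapshot_download` from the `huggingface_hub` package, restricted with `allow_patterns`. This is a minimal sketch, not an official download procedure: the repository id `aihpi/F5-TTS-German` is taken from this model page, while the helper names `checkpoint_patterns` and `download_checkpoints` are made up for illustration.

```python
# Map each vocoder to the checkpoint directory named on this model card.
VOCODER_DIRS = {
    "vocos": "F5TTS_Base",
    "bigvgan": "F5TTS_Base_bigvgan",
}


def checkpoint_patterns(vocoder):
    """Return glob patterns selecting the files for one vocoder."""
    if vocoder not in VOCODER_DIRS:
        raise ValueError(f"unknown vocoder {vocoder!r}; use 'vocos' or 'bigvgan'")
    # The trailing "/" keeps "F5TTS_Base/*" from matching "F5TTS_Base_bigvgan/...".
    return [f"{VOCODER_DIRS[vocoder]}/*"]


def download_checkpoints(vocoder="vocos", local_dir=None):
    """Download one checkpoint directory and return the local snapshot path."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="aihpi/F5-TTS-German",
        allow_patterns=checkpoint_patterns(vocoder),
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # Requires network access and `pip install huggingface_hub`.
    print(download_checkpoints("vocos"))
```

Restricting the patterns avoids pulling both checkpoint sets when only one vocoder is needed.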

  • Datasets: Common Voice (Mozilla) and Emilia_DE
  • Training process: Fine-tuned from the base F5-TTS model checkpoints
  • Training hardware: 8x NVIDIA H100

Contact

Acknowledgements

The authors acknowledge financial support from the German Federal Ministry for Education and Research (BMBF) through the project «KI-Servicezentrum Berlin Brandenburg» (01IS22092).

Model tree for aihpi/F5-TTS-German

Base model: SWivid/F5-TTS (this model is a fine-tune of it)