---
license: mit
---
# Bangla TTS
The Bangla TTS model was trained on a single (female) speaker using the ViT-TTS model. The paper is [ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer](https://arxiv.org/pdf/2305.12708.pdf).
We used the Coqui-AI 🐸 TTS toolkit for Bangla text-to-speech training as well as inference.
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1ea_BVSinWFy_9W2AH7NI55Ur0XO4Tr-a?usp=sharing)
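Since this README describes a VITS-based system trained with the Coqui TTS toolkit (see the Contributions section and reference 2), the following is a minimal, illustrative training sketch based on Coqui's public VITS recipe. The dataset path, batch size, text cleaner, and run name are assumptions for illustration, not the exact settings used for this model, and the Coqui API varies slightly between versions.

```python
# Condensed training sketch following Coqui TTS's standard VITS recipe.
# Paths and hyperparameters below are placeholders, not the project's real settings.
from trainer import Trainer, TrainerArgs

from TTS.tts.configs.shared_configs import BaseDatasetConfig
from TTS.tts.configs.vits_config import VitsConfig
from TTS.tts.datasets import load_tts_samples
from TTS.tts.models.vits import Vits, VitsAudioConfig
from TTS.tts.utils.text.tokenizer import TTSTokenizer
from TTS.utils.audio import AudioProcessor

output_path = "vits_bangla_female"

# LJSpeech-formatted Bangla corpus (see the Dataset section below).
dataset_config = BaseDatasetConfig(
    formatter="ljspeech", meta_file_train="metadata.csv", path="bangla_ljspeech/"
)
audio_config = VitsAudioConfig(
    sample_rate=22050, win_length=1024, hop_length=256, num_mels=80, mel_fmin=0, mel_fmax=None
)
config = VitsConfig(
    audio=audio_config,
    run_name="vits_bangla_female",
    batch_size=16,
    eval_batch_size=8,
    epochs=1000,
    text_cleaner="basic_cleaners",  # assumed; Bangla text kept as raw characters
    use_phonemes=False,
    mixed_precision=True,
    output_path=output_path,
    datasets=[dataset_config],
)

# Initialize the audio processor, tokenizer, and data samples from the config.
ap = AudioProcessor.init_from_config(config)
tokenizer, config = TTSTokenizer.init_from_config(config)
train_samples, eval_samples = load_tts_samples(dataset_config, eval_split=True)

# Build the VITS model and start training.
model = Vits(config, ap, tokenizer, speaker_manager=None)
trainer = Trainer(
    TrainerArgs(), config, output_path,
    model=model, train_samples=train_samples, eval_samples=eval_samples,
)
trainer.fit()
```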
# Contributions
* Collected various Bangla datasets from the internet (part of the data comes from the Mozilla Common Voice dataset) and trained the model on them.
* Developed a Bangla VITS TTS (text-to-speech) system, trained and used for reading various Bangla text with a high-performing, state-of-the-art (SOTA) Bangla neural voice.
# Dataset
The Bangla Text-to-Speech (TTS) team at IIT Madras has curated a Bangla speech corpus, which has been carefully processed for research purposes. The audio has been downsampled to 22,050 Hz and the annotations reformatted from the original IITM style to the LJSpeech format. This refined dataset, tailored for Bangla TTS, is accompanied by the weight files of the best-trained models.
Researchers are encouraged to cite the corresponding paper, available at [Paper Link](https://aclanthology.org/2020.lrec-1.789.pdf), when utilizing this dataset in their research endeavors. The provided dataset and model weights contribute to the advancement of Bangla TTS research and serve as a valuable resource for further investigations in the field.
[Dataset Link](https://www.kaggle.com/datasets/mobassir/comprehensive-bangla-tts)
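Below is a minimal sketch of the preprocessing described above: resampling each utterance to 22,050 Hz and writing an LJSpeech-style `metadata.csv`. The input layout (a `wav/` folder plus a tab-separated `transcripts.txt`) is an assumption for illustration, not the exact IITM annotation format.

```python
# Resample audio to 22,050 Hz and convert transcripts to the LJSpeech layout
# (wavs/ directory + pipe-separated metadata.csv). Input layout is assumed.
import os

import librosa
import soundfile as sf

SRC_DIR = "raw_corpus"       # assumed input: raw_corpus/wav/*.wav + raw_corpus/transcripts.txt
DST_DIR = "bangla_ljspeech"  # output: bangla_ljspeech/wavs/ + bangla_ljspeech/metadata.csv
TARGET_SR = 22050

os.makedirs(os.path.join(DST_DIR, "wavs"), exist_ok=True)

with open(os.path.join(SRC_DIR, "transcripts.txt"), encoding="utf-8") as src, \
     open(os.path.join(DST_DIR, "metadata.csv"), "w", encoding="utf-8") as dst:
    for line in src:
        utt_id, text = line.rstrip("\n").split("\t", 1)
        # Load and resample each utterance to 22,050 Hz mono.
        wav, _ = librosa.load(os.path.join(SRC_DIR, "wav", f"{utt_id}.wav"),
                              sr=TARGET_SR, mono=True)
        sf.write(os.path.join(DST_DIR, "wavs", f"{utt_id}.wav"), wav, TARGET_SR)
        # LJSpeech metadata: id|raw text|normalized text (pipe-separated).
        dst.write(f"{utt_id}|{text}|{text}\n")
```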
# Evaluation
Mean Opinion Score (MOS): 4.10
[MOS Calculation method](https://waywithwords.net/landing/mean-opinion-score-ratings-2/)
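For reference, MOS is simply the arithmetic mean of listener ratings on a 1-5 scale. The snippet below illustrates the calculation with made-up ratings; the 4.10 score above comes from the project's own listening test, not from these numbers.

```python
# Hypothetical per-listener scores (1-5 scale), for illustration only.
ratings = [4, 5, 4, 4, 5, 3, 4, 5, 4, 4]
mos = sum(ratings) / len(ratings)
print(f"MOS: {mos:.2f}")  # -> MOS: 4.20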
# Inference
For testing, please check the endpoint integration on [GitHub](https://github.com/saiful9379/Bangla_TTS). A minimal inference sketch is shown below.
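The sketch below uses the Coqui TTS Python API to synthesize speech from a local checkpoint. The checkpoint and config file names are placeholders, not guaranteed paths in this repository; adjust them to the files you download.

```python
# Minimal inference sketch with the Coqui TTS API, assuming the model
# checkpoint and config have been downloaded locally (paths are placeholders).
from TTS.api import TTS

tts = TTS(
    model_path="bangla_tts_female/best_model.pth",  # assumed checkpoint path
    config_path="bangla_tts_female/config.json",    # assumed config path
    gpu=False,
)

# Synthesize Bangla text to a wav file.
tts.tts_to_file(
    text="আমি বাংলায় কথা বলি।",
    file_path="output.wav",
)
```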
# References
1. https://aclanthology.org/2020.lrec-1.789.pdf
2. https://arxiv.org/pdf/2106.06103.pdf
3. https://arxiv.org/abs/2005.11129
4. https://aclanthology.org/2020.emnlp-main.207.pdf
5. https://github.com/mobassir94