# SpeechT5 TTS Igbo Yoruba
This model is a fine-tuned version of microsoft/speecht5_tts on the all_tts_v2_processed_with_speaker_embeddings dataset. It achieves the following results on the evaluation set:
- Loss: 0.4111
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 6
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 12
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 18000
- mixed_precision_training: Native AMP
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
0.6671 | 0.0526 | 250 | 0.5571 |
0.5767 | 0.1052 | 500 | 0.4814 |
0.5233 | 0.1577 | 750 | 0.4562 |
0.5045 | 0.2103 | 1000 | 0.4461 |
0.4917 | 0.2629 | 1250 | 0.4440 |
0.4908 | 0.3155 | 1500 | 0.4398 |
0.4881 | 0.3680 | 1750 | 0.4346 |
0.4855 | 0.4206 | 2000 | 0.4361 |
0.4785 | 0.4732 | 2250 | 0.4343 |
0.4753 | 0.5258 | 2500 | 0.4310 |
0.4767 | 0.5783 | 2750 | 0.4309 |
0.4707 | 0.6309 | 3000 | 0.4280 |
0.4724 | 0.6835 | 3250 | 0.4278 |
0.4694 | 0.7361 | 3500 | 0.4264 |
0.4674 | 0.7886 | 3750 | 0.4259 |
0.4659 | 0.8412 | 4000 | 0.4263 |
0.4631 | 0.8938 | 4250 | 0.4243 |
0.4644 | 0.9464 | 4500 | 0.4232 |
0.4619 | 0.9989 | 4750 | 0.4221 |
0.4662 | 1.0515 | 5000 | 0.4244 |
0.4602 | 1.1041 | 5250 | 0.4217 |
0.4616 | 1.1567 | 5500 | 0.4211 |
0.461 | 1.2093 | 5750 | 0.4201 |
0.4576 | 1.2618 | 6000 | 0.4212 |
0.4573 | 1.3144 | 6250 | 0.4187 |
0.4598 | 1.3670 | 6500 | 0.4186 |
0.4551 | 1.4196 | 6750 | 0.4200 |
0.4599 | 1.4721 | 7000 | 0.4175 |
0.4576 | 1.5247 | 7250 | 0.4169 |
0.4569 | 1.5773 | 7500 | 0.4180 |
0.4539 | 1.6299 | 7750 | 0.4175 |
0.4552 | 1.6824 | 8000 | 0.4158 |
0.4554 | 1.7350 | 8250 | 0.4163 |
0.451 | 1.7876 | 8500 | 0.4171 |
0.4558 | 1.8402 | 8750 | 0.4163 |
0.4539 | 1.8927 | 9000 | 0.4153 |
0.4537 | 1.9453 | 9250 | 0.4160 |
0.453 | 1.9979 | 9500 | 0.4164 |
0.4539 | 2.0505 | 9750 | 0.4157 |
0.4561 | 2.1030 | 10000 | 0.4143 |
0.4513 | 2.1556 | 10250 | 0.4144 |
0.4525 | 2.2082 | 10500 | 0.4145 |
0.4532 | 2.2608 | 10750 | 0.4149 |
0.4483 | 2.3134 | 11000 | 0.4140 |
0.4496 | 2.3659 | 11250 | 0.4142 |
0.4513 | 2.4185 | 11500 | 0.4131 |
0.4492 | 2.4711 | 11750 | 0.4134 |
0.4504 | 2.5237 | 12000 | 0.4130 |
0.4484 | 2.5762 | 12250 | 0.4131 |
0.4522 | 2.6288 | 12500 | 0.4132 |
0.4467 | 2.6814 | 12750 | 0.4124 |
0.4487 | 2.7340 | 13000 | 0.4125 |
0.4462 | 2.7865 | 13250 | 0.4117 |
0.4459 | 2.8391 | 13500 | 0.4119 |
0.4485 | 2.8917 | 13750 | 0.4121 |
0.4467 | 2.9443 | 14000 | 0.4121 |
0.4495 | 2.9968 | 14250 | 0.4124 |
0.4473 | 3.0494 | 14500 | 0.4111 |
0.4462 | 3.1020 | 14750 | 0.4112 |
0.445 | 3.1546 | 15000 | 0.4119 |
0.4497 | 3.2072 | 15250 | 0.4133 |
0.4488 | 3.2597 | 15500 | 0.4116 |
0.4451 | 3.3123 | 15750 | 0.4115 |
0.4473 | 3.3649 | 16000 | 0.4115 |
0.4416 | 3.4175 | 16250 | 0.4116 |
0.4454 | 3.4700 | 16500 | 0.4106 |
0.4491 | 3.5226 | 16750 | 0.4112 |
0.4502 | 3.5752 | 17000 | 0.4108 |
0.4488 | 3.6278 | 17250 | 0.4111 |
0.4474 | 3.6803 | 17500 | 0.4109 |
0.4478 | 3.7329 | 17750 | 0.4110 |
0.4468 | 3.7855 | 18000 | 0.4111 |
### Framework versions
- Transformers 4.48.1
- Pytorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0
## Model tree for ccibeekeoc42/speecht5_finetuned_naija_ig_yo_2025-01-20_O2

- Base model: microsoft/speecht5_tts