Hubert-common_voice_JSUT-ja-demo-kana

This model is a fine-tuned version of rinna/japanese-hubert-base on the MOZILLA-FOUNDATION/COMMON_VOICE_13_0 - JA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5729
  • Wer: 0.9998
  • Cer: 0.3126
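
The gap between Wer (close to 1.0) and Cer (about 0.31) is easier to read with the metric definitions in hand. The sketch below computes both as an edit distance divided by the reference length, once over words and once over characters. It assumes words come from whitespace splitting; the source does not say how the metrics were computed, and the example strings are made up for illustration. Under that assumption an unsegmented kana string counts as a single word, so one wrong character already gives a WER of 1.0 for that utterance.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    # Assumption: words are whitespace-separated tokens.
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical strings, not taken from the evaluation set.
reference = "コンニチハセカイ"   # 8 kana, no spaces
hypothesis = "コンニチワセカイ"  # one substituted character

print(wer(reference, hypothesis))  # 1.0
print(cer(reference, hypothesis))  # 0.125
```

Because insertions count as errors, both rates can exceed 1.0 when the hypothesis is much longer than the reference, which is consistent with the values above 1 in the first rows of the training results.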

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 12500
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
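
Two of these values are worth checking against each other. The sketch below reproduces total_train_batch_size from the batch size and accumulation steps, assuming a single device (the source does not state the device count), and evaluates a linear-warmup-then-cosine learning-rate multiplier of the kind commonly used for a cosine scheduler with warmup. The total of 10,340 steps is an estimate from the results table (step 10300 at epoch 19.9235), not a reported value. Under these assumptions the 12,500 warmup steps exceed the length of the run, so the learning rate would still be ramping up when training ends and would not reach 3e-05.

```python
import math

LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 12500
NUM_DEVICES = 1      # assumption: not stated in the source
TOTAL_STEPS = 10340  # assumption: estimated from step 10300 at epoch 19.9235


def lr_multiplier(step, warmup_steps, total_steps):
    """Linear warmup to 1.0, then half-cosine decay to 0.0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


def lr_at(step):
    return LEARNING_RATE * lr_multiplier(step, WARMUP_STEPS, TOTAL_STEPS)


effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES
print(effective_batch)          # 32
print(f"{lr_at(10300):.3e}")    # 2.472e-05
```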

Training results

Training Loss | Epoch | Step | Validation Loss | Wer | Cer
No log | 0.1936 | 100 | 41.9560 | 1.5294 | 6.1696
No log | 0.3872 | 200 | 41.4657 | 1.4235 | 5.9722
No log | 0.5808 | 300 | 40.2769 | 1.1848 | 3.7327
No log | 0.7744 | 400 | 36.3010 | 1.0 | 0.9963
31.4419 | 0.9681 | 500 | 24.5426 | 1.0 | 0.9991
31.4419 | 1.1607 | 600 | 18.8642 | 1.0 | 0.9991
31.4419 | 1.3543 | 700 | 17.6651 | 1.0 | 0.9991
31.4419 | 1.5479 | 800 | 17.2007 | 1.0 | 0.9992
31.4419 | 1.7415 | 900 | 16.7617 | 1.0 | 0.9991
14.8315 | 1.9351 | 1000 | 16.2895 | 1.0 | 0.9991
14.8315 | 2.1278 | 1100 | 15.7877 | 1.0 | 0.9991
14.8315 | 2.3214 | 1200 | 15.2488 | 1.0 | 0.9991
14.8315 | 2.5150 | 1300 | 14.6680 | 1.0 | 0.9991
14.8315 | 2.7086 | 1400 | 14.0637 | 1.0 | 0.9991
12.4363 | 2.9022 | 1500 | 13.4217 | 1.0 | 0.9991
12.4363 | 3.0949 | 1600 | 12.7374 | 1.0 | 0.9991
12.4363 | 3.2885 | 1700 | 12.0319 | 1.0 | 0.9991
12.4363 | 3.4821 | 1800 | 11.2982 | 1.0 | 0.9991
12.4363 | 3.6757 | 1900 | 10.5580 | 1.0 | 0.9992
9.8267 | 3.8693 | 2000 | 9.8129 | 1.0 | 0.9991
9.8267 | 4.0620 | 2100 | 9.0640 | 1.0 | 0.9991
9.8267 | 4.2556 | 2200 | 8.3376 | 1.0 | 0.9992
9.8267 | 4.4492 | 2300 | 7.6287 | 1.0 | 0.9991
9.8267 | 4.6428 | 2400 | 6.9678 | 1.0 | 0.9991
6.9778 | 4.8364 | 2500 | 6.3635 | 1.0 | 0.9992
6.9778 | 5.0290 | 2600 | 5.8258 | 1.0 | 0.9991
6.9778 | 5.2227 | 2700 | 5.3677 | 1.0 | 0.9991
6.9778 | 5.4163 | 2800 | 4.9888 | 1.0 | 0.9991
6.9778 | 5.6099 | 2900 | 4.6956 | 1.0 | 0.9991
4.8731 | 5.8035 | 3000 | 4.4788 | 1.0 | 0.9991
4.8731 | 5.9971 | 3100 | 4.3287 | 1.0 | 0.9991
4.8731 | 6.1897 | 3200 | 4.2057 | 1.0 | 0.9991
4.8731 | 6.3833 | 3300 | 4.1448 | 1.0 | 0.9991
4.8731 | 6.5770 | 3400 | 4.1095 | 1.0 | 0.9991
4.1216 | 6.7706 | 3500 | 4.0858 | 1.0 | 0.9991
4.1216 | 6.9642 | 3600 | 4.0725 | 1.0 | 0.9991
4.1216 | 7.1568 | 3700 | 4.0648 | 1.0 | 0.9991
4.1216 | 7.3504 | 3800 | 4.0578 | 1.0 | 0.9991
4.1216 | 7.5440 | 3900 | 4.0494 | 1.0 | 0.9991
4.0264 | 7.7377 | 4000 | 4.0367 | 1.0 | 0.9991
4.0264 | 7.9313 | 4100 | 4.0276 | 1.0 | 0.9991
4.0264 | 8.1239 | 4200 | 4.0121 | 1.0 | 0.9991
4.0264 | 8.3175 | 4300 | 3.9720 | 1.0 | 0.9991
4.0264 | 8.5111 | 4400 | 3.9031 | 1.0 | 0.9991
3.937 | 8.7047 | 4500 | 3.8091 | 1.0 | 0.9991
3.937 | 8.8984 | 4600 | 3.6690 | 1.0 | 0.9991
3.937 | 9.0910 | 4700 | 3.4759 | 1.0 | 0.9991
3.937 | 9.2846 | 4800 | 3.2108 | 1.0 | 0.9987
3.937 | 9.4782 | 4900 | 2.6813 | 1.0 | 0.6453
3.1866 | 9.6718 | 5000 | 2.3876 | 1.0002 | 0.5372
3.1866 | 9.8654 | 5100 | 2.1678 | 1.0 | 0.4902
3.1866 | 10.0581 | 5200 | 1.9945 | 1.0002 | 0.4530
3.1866 | 10.2517 | 5300 | 1.8576 | 1.0 | 0.4270
3.1866 | 10.4453 | 5400 | 1.7788 | 1.0 | 0.4399
1.9458 | 10.6389 | 5500 | 1.6520 | 1.0 | 0.4094
1.9458 | 10.8325 | 5600 | 1.5545 | 1.0 | 0.3874
1.9458 | 11.0252 | 5700 | 1.4698 | 1.0 | 0.3800
1.9458 | 11.2188 | 5800 | 1.4052 | 1.0 | 0.3777
1.9458 | 11.4124 | 5900 | 1.3276 | 1.0 | 0.3658
1.4263 | 11.6060 | 6000 | 1.2710 | 1.0 | 0.3668
1.4263 | 11.7996 | 6100 | 1.2150 | 1.0 | 0.3536
1.4263 | 11.9932 | 6200 | 1.1586 | 1.0 | 0.3531
1.4263 | 12.1859 | 6300 | 1.1156 | 1.0 | 0.3519
1.4263 | 12.3795 | 6400 | 1.0729 | 1.0 | 0.3484
1.1212 | 12.5731 | 6500 | 1.0345 | 1.0 | 0.3467
1.1212 | 12.7667 | 6600 | 0.9887 | 1.0 | 0.3428
1.1212 | 12.9603 | 6700 | 0.9630 | 1.0 | 0.3417
1.1212 | 13.1530 | 6800 | 0.9260 | 1.0 | 0.3381
1.1212 | 13.3466 | 6900 | 0.9005 | 1.0 | 0.3397
0.9141 | 13.5402 | 7000 | 0.8764 | 1.0 | 0.3369
0.9141 | 13.7338 | 7100 | 0.8512 | 1.0 | 0.3363
0.9141 | 13.9274 | 7200 | 0.8273 | 1.0 | 0.3351
0.9141 | 14.1200 | 7300 | 0.8083 | 1.0 | 0.3329
0.9141 | 14.3136 | 7400 | 0.7851 | 0.9998 | 0.3300
0.7811 | 14.5073 | 7500 | 0.7743 | 1.0 | 0.3312
0.7811 | 14.7009 | 7600 | 0.7510 | 0.9998 | 0.3272
0.7811 | 14.8945 | 7700 | 0.7366 | 1.0 | 0.3267
0.7811 | 15.0871 | 7800 | 0.7290 | 1.0 | 0.3253
0.7811 | 15.2807 | 7900 | 0.7132 | 1.0 | 0.3247
0.6725 | 15.4743 | 8000 | 0.7190 | 1.0 | 0.3277
0.6725 | 15.6680 | 8100 | 0.7006 | 1.0 | 0.3241
0.6725 | 15.8616 | 8200 | 0.6835 | 1.0 | 0.3226
0.6725 | 16.0542 | 8300 | 0.6698 | 0.9998 | 0.3209
0.6725 | 16.2478 | 8400 | 0.6628 | 0.9998 | 0.3214
0.606 | 16.4414 | 8500 | 0.6538 | 1.0 | 0.3205
0.606 | 16.6350 | 8600 | 0.6523 | 1.0 | 0.3186
0.606 | 16.8287 | 8700 | 0.6449 | 1.0 | 0.3183
0.606 | 17.0213 | 8800 | 0.6401 | 1.0 | 0.3179
0.606 | 17.2149 | 8900 | 0.6333 | 1.0 | 0.3200
0.5492 | 17.4085 | 9000 | 0.6333 | 1.0 | 0.3201
0.5492 | 17.6021 | 9100 | 0.6219 | 1.0 | 0.3179
0.5492 | 17.7957 | 9200 | 0.6189 | 1.0 | 0.3201
0.5492 | 17.9894 | 9300 | 0.6023 | 0.9998 | 0.3166
0.5492 | 18.1820 | 9400 | 0.6084 | 1.0 | 0.3154
0.5057 | 18.3756 | 9500 | 0.6002 | 0.9998 | 0.3147
0.5057 | 18.5692 | 9600 | 0.5875 | 1.0 | 0.3128
0.5057 | 18.7628 | 9700 | 0.5903 | 0.9998 | 0.3138
0.5057 | 18.9564 | 9800 | 0.5930 | 1.0 | 0.3127
0.5057 | 19.1491 | 9900 | 0.5855 | 1.0 | 0.3141
0.4709 | 19.3427 | 10000 | 0.5880 | 1.0 | 0.3120
0.4709 | 19.5363 | 10100 | 0.5855 | 1.0 | 0.3131
0.4709 | 19.7299 | 10200 | 0.5734 | 1.0 | 0.3106
0.4709 | 19.9235 | 10300 | 0.5777 | 1.0 | 0.3109
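
Rows in this log can be read programmatically, for example to compare checkpoints. The sketch below parses a handful of rows copied from the table, treats "No log" as a missing training loss, and reports the step with the lowest Cer among them. The `parse_row` helper and the `Row` type are hypothetical names introduced for illustration, and the sample is only a subset of the table.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    train_loss: Optional[float]
    epoch: float
    step: int
    val_loss: float
    wer: float
    cer: float


def parse_row(line: str) -> Row:
    # Tolerate optional "|" column separators as well as plain whitespace.
    tokens = line.replace("|", " ").split()
    if tokens[:2] == ["No", "log"]:
        train_loss, rest = None, tokens[2:]
    else:
        train_loss, rest = float(tokens[0]), tokens[1:]
    epoch, step, val_loss, wer, cer = rest
    return Row(train_loss, float(epoch), int(step),
               float(val_loss), float(wer), float(cer))


# A few rows copied from the table above (not the full log).
SAMPLE = """\
No log 0.1936 100 41.9560 1.5294 6.1696
3.937 9.4782 4900 2.6813 1.0 0.6453
0.4709 19.5363 10100 0.5855 1.0 0.3131
0.4709 19.7299 10200 0.5734 1.0 0.3106
0.4709 19.9235 10300 0.5777 1.0 0.3109
"""

rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]
best = min(rows, key=lambda r: r.cer)
print(best.step, best.cer)  # 10200 0.3106
```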

Framework versions

  • Transformers 4.47.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3

Model size: 94.4M params (Safetensors, tensor type F32)

Model tree for utakumi/Hubert-common_voice_JSUT-ja-demo-kana: fine-tuned from rinna/japanese-hubert-base.