Hubert-common_voice_JSUT-ja-demo-roma

This model is a fine-tuned version of rinna/japanese-hubert-base on the MOZILLA-FOUNDATION/COMMON_VOICE_13_0 - JA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3850
  • Wer: 0.9984
  • Cer: 0.1901
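
No usage example accompanies this card. The following is a minimal inference sketch, assuming the checkpoint is hosted under the repo id utakumi/Hubert-common_voice_JSUT-ja-demo-roma and ships a CTC head plus a Wav2Vec2-style processor config (both assumptions, not confirmed by the card):

```python
# Minimal inference sketch (repo id and processor availability are assumptions).
import torch
import librosa
from transformers import AutoProcessor, HubertForCTC

repo_id = "utakumi/Hubert-common_voice_JSUT-ja-demo-roma"  # assumed repo id
processor = AutoProcessor.from_pretrained(repo_id)
model = HubertForCTC.from_pretrained(repo_id)
model.eval()

# HuBERT expects 16 kHz mono audio.
speech, _ = librosa.load("sample.wav", sr=16000, mono=True)
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding; given the "-roma" suffix, the output is
# presumably a romanized (romaji) transcript.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```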

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
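
The card's header does identify the training data as the Japanese split of Common Voice 13.0 (Hub repo mozilla-foundation/common_voice_13_0). A minimal loading sketch follows; the dataset is gated on the Hub, so you must accept its terms and authenticate first, and the romanization/preprocessing used for this model is not documented here:

```python
# Sketch: load the Japanese split of Common Voice 13.0 named in this card.
# Assumptions: terms accepted and logged in (huggingface-cli login); depending
# on your `datasets` version, this script-based dataset may also require
# trust_remote_code=True.
from datasets import load_dataset, Audio

common_voice = load_dataset(
    "mozilla-foundation/common_voice_13_0", "ja", split="train"
)
# HuBERT consumes 16 kHz audio, so resample on the fly.
common_voice = common_voice.cast_column("audio", Audio(sampling_rate=16_000))
print(common_voice[0]["sentence"])
```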

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 12500
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
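
These settings map onto transformers' TrainingArguments roughly as follows. This is a reconstruction sketch, not the published training script; the output directory name and the fp16 flag (standing in for "Native AMP") are assumptions:

```python
# Sketch of a TrainingArguments setup matching the listed hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Hubert-common_voice_JSUT-ja-demo-roma",  # assumed name
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 16 * 2 = 32
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=12500,
    num_train_epochs=20.0,
    fp16=True,                       # "Native AMP" mixed-precision training
)
```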

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Wer    | Cer    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
| No log        | 0.1934  | 100   | 16.2719         | 3.0918 | 3.3907 |
| No log        | 0.3868  | 200   | 16.0057         | 2.8734 | 2.8937 |
| No log        | 0.5803  | 300   | 15.3748         | 1.9646 | 1.6906 |
| No log        | 0.7737  | 400   | 13.0439         | 1.0    | 0.9292 |
| 11.8785       | 0.9671  | 500   | 7.6485          | 1.0    | 0.9292 |
| 11.8785       | 1.1605  | 600   | 6.0074          | 1.0    | 0.9292 |
| 11.8785       | 1.3540  | 700   | 5.6465          | 1.0    | 0.9292 |
| 11.8785       | 1.5474  | 800   | 5.5026          | 1.0    | 0.9292 |
| 11.8785       | 1.7408  | 900   | 5.3618          | 1.0    | 0.9292 |
| 4.9912        | 1.9342  | 1000  | 5.2193          | 1.0    | 0.9292 |
| 4.9912        | 2.1277  | 1100  | 5.0688          | 1.0    | 0.9292 |
| 4.9912        | 2.3211  | 1200  | 4.9141          | 1.0    | 0.9292 |
| 4.9912        | 2.5145  | 1300  | 4.7537          | 1.0    | 0.9292 |
| 4.9912        | 2.7079  | 1400  | 4.5894          | 1.0    | 0.9292 |
| 4.3024        | 2.9014  | 1500  | 4.4225          | 1.0    | 0.9292 |
| 4.3024        | 3.0948  | 1600  | 4.2562          | 1.0    | 0.9292 |
| 4.3024        | 3.2882  | 1700  | 4.0944          | 1.0    | 0.9292 |
| 4.3024        | 3.4816  | 1800  | 3.9344          | 1.0    | 0.9292 |
| 4.3024        | 3.6750  | 1900  | 3.7835          | 1.0    | 0.9292 |
| 3.6966        | 3.8685  | 2000  | 3.6411          | 1.0    | 0.9292 |
| 3.6966        | 4.0619  | 2100  | 3.5156          | 1.0    | 0.9292 |
| 3.6966        | 4.2553  | 2200  | 3.3973          | 1.0    | 0.9292 |
| 3.6966        | 4.4487  | 2300  | 3.2909          | 1.0    | 0.9292 |
| 3.6966        | 4.6422  | 2400  | 3.1957          | 1.0    | 0.9292 |
| 3.2011        | 4.8356  | 2500  | 3.1159          | 1.0    | 0.9292 |
| 3.2011        | 5.0290  | 2600  | 3.0544          | 1.0    | 0.9292 |
| 3.2011        | 5.2224  | 2700  | 3.0039          | 1.0    | 0.9292 |
| 3.2011        | 5.4159  | 2800  | 2.9654          | 1.0    | 0.9292 |
| 3.2011        | 5.6093  | 2900  | 2.9387          | 1.0    | 0.9292 |
| 2.9439        | 5.8027  | 3000  | 2.9091          | 1.0    | 0.9292 |
| 2.9439        | 5.9961  | 3100  | 2.8868          | 1.0    | 0.9292 |
| 2.9439        | 6.1896  | 3200  | 2.8660          | 1.0    | 0.9292 |
| 2.9439        | 6.3830  | 3300  | 2.8533          | 1.0    | 0.9292 |
| 2.9439        | 6.5764  | 3400  | 2.7337          | 1.0    | 0.9292 |
| 2.7884        | 6.7698  | 3500  | 2.5230          | 1.0    | 0.9292 |
| 2.7884        | 6.9632  | 3600  | 2.2724          | 1.0    | 0.9182 |
| 2.7884        | 7.1567  | 3700  | 1.9633          | 1.0    | 0.6316 |
| 2.7884        | 7.3501  | 3800  | 1.5858          | 1.0    | 0.4242 |
| 2.7884        | 7.5435  | 3900  | 1.3510          | 0.9998 | 0.3861 |
| 1.7651        | 7.7369  | 4000  | 1.1917          | 0.9993 | 0.3334 |
| 1.7651        | 7.9304  | 4100  | 1.0716          | 0.9980 | 0.2982 |
| 1.7651        | 8.1238  | 4200  | 0.9762          | 0.9976 | 0.2782 |
| 1.7651        | 8.3172  | 4300  | 0.9044          | 0.9965 | 0.2596 |
| 1.7651        | 8.5106  | 4400  | 0.8529          | 0.9963 | 0.2566 |
| 0.9278        | 8.7041  | 4500  | 0.7958          | 0.9971 | 0.2466 |
| 0.9278        | 8.8975  | 4600  | 0.7535          | 0.9965 | 0.2435 |
| 0.9278        | 9.0909  | 4700  | 0.7190          | 0.9974 | 0.2403 |
| 0.9278        | 9.2843  | 4800  | 0.6800          | 0.9974 | 0.2356 |
| 0.9278        | 9.4778  | 4900  | 0.6568          | 0.9963 | 0.2330 |
| 0.6673        | 9.6712  | 5000  | 0.6318          | 0.9960 | 0.2329 |
| 0.6673        | 9.8646  | 5100  | 0.6132          | 0.9973 | 0.2293 |
| 0.6673        | 10.0580 | 5200  | 0.5896          | 0.9971 | 0.2261 |
| 0.6673        | 10.2515 | 5300  | 0.5743          | 0.9962 | 0.2231 |
| 0.6673        | 10.4449 | 5400  | 0.5562          | 0.9960 | 0.2215 |
| 0.5392        | 10.6383 | 5500  | 0.5473          | 0.9973 | 0.2237 |
| 0.5392        | 10.8317 | 5600  | 0.5307          | 0.9963 | 0.2185 |
| 0.5392        | 11.0251 | 5700  | 0.5195          | 0.9976 | 0.2173 |
| 0.5392        | 11.2186 | 5800  | 0.5090          | 0.9978 | 0.2164 |
| 0.5392        | 11.4120 | 5900  | 0.4979          | 0.9974 | 0.2135 |
| 0.4572        | 11.6054 | 6000  | 0.4901          | 0.9974 | 0.2127 |
| 0.4572        | 11.7988 | 6100  | 0.4872          | 0.9993 | 0.2137 |
| 0.4572        | 11.9923 | 6200  | 0.4754          | 0.9973 | 0.2119 |
| 0.4572        | 12.1857 | 6300  | 0.4724          | 0.9969 | 0.2120 |
| 0.4572        | 12.3791 | 6400  | 0.4650          | 0.9987 | 0.2088 |
| 0.41          | 12.5725 | 6500  | 0.4592          | 0.9976 | 0.2076 |
| 0.41          | 12.7660 | 6600  | 0.4503          | 0.9982 | 0.2064 |
| 0.41          | 12.9594 | 6700  | 0.4478          | 0.9963 | 0.2099 |
| 0.41          | 13.1528 | 6800  | 0.4496          | 0.9982 | 0.2061 |
| 0.41          | 13.3462 | 6900  | 0.4438          | 0.9982 | 0.2052 |
| 0.3688        | 13.5397 | 7000  | 0.4365          | 0.9991 | 0.2040 |
| 0.3688        | 13.7331 | 7100  | 0.4288          | 0.9980 | 0.2046 |
| 0.3688        | 13.9265 | 7200  | 0.4299          | 0.9982 | 0.2025 |
| 0.3688        | 14.1199 | 7300  | 0.4274          | 0.9985 | 0.2026 |
| 0.3688        | 14.3133 | 7400  | 0.4242          | 0.9984 | 0.2006 |
| 0.3394        | 14.5068 | 7500  | 0.4253          | 0.9971 | 0.2001 |
| 0.3394        | 14.7002 | 7600  | 0.4178          | 0.9974 | 0.1996 |
| 0.3394        | 14.8936 | 7700  | 0.4182          | 0.9984 | 0.2004 |
| 0.3394        | 15.0870 | 7800  | 0.4194          | 0.9971 | 0.1979 |
| 0.3394        | 15.2805 | 7900  | 0.4160          | 0.9978 | 0.1997 |
| 0.3157        | 15.4739 | 8000  | 0.4096          | 0.9974 | 0.2010 |
| 0.3157        | 15.6673 | 8100  | 0.4088          | 0.9978 | 0.1980 |
| 0.3157        | 15.8607 | 8200  | 0.4119          | 0.9984 | 0.1974 |
| 0.3157        | 16.0542 | 8300  | 0.4099          | 0.9984 | 0.1965 |
| 0.3157        | 16.2476 | 8400  | 0.4086          | 0.9985 | 0.1977 |
| 0.2917        | 16.4410 | 8500  | 0.4097          | 0.9984 | 0.1968 |
| 0.2917        | 16.6344 | 8600  | 0.4113          | 0.9980 | 0.1949 |
| 0.2917        | 16.8279 | 8700  | 0.4018          | 0.9984 | 0.1956 |
| 0.2917        | 17.0213 | 8800  | 0.4043          | 0.9984 | 0.1934 |
| 0.2917        | 17.2147 | 8900  | 0.4046          | 0.9980 | 0.1946 |
| 0.2785        | 17.4081 | 9000  | 0.4046          | 0.9982 | 0.1927 |
| 0.2785        | 17.6015 | 9100  | 0.4016          | 0.9989 | 0.1948 |
| 0.2785        | 17.7950 | 9200  | 0.4013          | 0.9984 | 0.1922 |
| 0.2785        | 17.9884 | 9300  | 0.3879          | 0.9989 | 0.1930 |
| 0.2785        | 18.1818 | 9400  | 0.4009          | 0.9980 | 0.1928 |
| 0.2647        | 18.3752 | 9500  | 0.3904          | 0.9985 | 0.1926 |
| 0.2647        | 18.5687 | 9600  | 0.3944          | 0.9984 | 0.1959 |
| 0.2647        | 18.7621 | 9700  | 0.3957          | 0.9989 | 0.1959 |
| 0.2647        | 18.9555 | 9800  | 0.3949          | 0.9982 | 0.1938 |
| 0.2647        | 19.1489 | 9900  | 0.4039          | 0.9973 | 0.1933 |
| 0.248         | 19.3424 | 10000 | 0.4082          | 0.9991 | 0.1934 |
| 0.248         | 19.5358 | 10100 | 0.4074          | 0.9993 | 0.1922 |
| 0.248         | 19.7292 | 10200 | 0.3955          | 0.9989 | 0.1906 |
| 0.248         | 19.9226 | 10300 | 0.3856          | 0.9980 | 0.1909 |
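
Note that the word error rate plateaus near 1.0 while the character error rate falls steadily; this pattern suggests the word segmentation of the romanized hypotheses rarely matches the references exactly, so CER is likely the more informative metric here. The card does not state how the metrics were computed; a minimal sketch using the evaluate library (the usual choice in Hugging Face ASR fine-tuning scripts, assumed here) is:

```python
# Sketch of WER/CER computation with the `evaluate` library.
# The exact metric implementation used for this card is not documented.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

predictions = ["konnichiwa sekai"]       # hypothetical romaji model output
references = ["konnichiwa sekai desu"]   # hypothetical reference transcript

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```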

Framework versions

  • Transformers 4.47.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3