Hubert-common_voice-kana-debug

This model is a fine-tuned version of rinna/japanese-hubert-base on the Japanese (ja) subset of the mozilla-foundation/common_voice_13_0 dataset. It achieves the following results on the evaluation set (a note on these numbers and an inference sketch follow the list):

  • Loss: 3.9982
  • Wer: 1.0
  • Cer: 0.9940
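A WER of 1.0 and a CER of 0.9940 mean the checkpoint did not learn a usable recognizer; as the "-debug" suffix suggests, this card documents a debugging run. For completeness, here is a minimal sketch of running the checkpoint for CTC inference, assuming the repository ships the processor files; "sample.wav" is a placeholder:

```python
import torch
import librosa
from transformers import AutoModelForCTC, AutoProcessor

repo = "utakumi/Hubert-common_voice-kana-debug"
model = AutoModelForCTC.from_pretrained(repo)
processor = AutoProcessor.from_pretrained(repo)  # assumes processor/tokenizer files exist in the repo

# Load audio at the 16 kHz sampling rate HuBERT expects; "sample.wav" is a placeholder path.
speech, _ = librosa.load("sample.wav", sr=16000)
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding; given the reported WER of 1.0, expect degenerate output.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids))
```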

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
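
As a rough illustration, these settings map onto transformers' TrainingArguments as sketched below. This is a hypothetical reconstruction, not the original training script; output_dir is a placeholder and any argument not listed above is left at its default:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Hubert-common_voice-kana-debug",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 16 * 2 = 32
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=30.0,
    fp16=True,  # "Native AMP" mixed precision
)
```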

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer | Cer    |
|:-------------:|:-------:|:----:|:---------------:|:---:|:------:|
| No log        | 0.7092  | 100  | 10.1087         | 1.0 | 0.9940 |
| No log        | 1.4184  | 200  | 4.1507          | 1.0 | 0.9940 |
| No log        | 2.1277  | 300  | 4.0102          | 1.0 | 0.9940 |
| No log        | 2.8369  | 400  | 4.0116          | 1.0 | 0.9940 |
| 7.2217        | 3.5461  | 500  | 4.0141          | 1.0 | 0.9940 |
| 7.2217        | 4.2553  | 600  | 4.0104          | 1.0 | 0.9940 |
| 7.2217        | 4.9645  | 700  | 4.0064          | 1.0 | 0.9940 |
| 7.2217        | 5.6738  | 800  | 4.0104          | 1.0 | 0.9940 |
| 7.2217        | 6.3830  | 900  | 3.9990          | 1.0 | 0.9943 |
| 4.0004        | 7.0922  | 1000 | 4.0107          | 1.0 | 0.9940 |
| 4.0004        | 7.8014  | 1100 | 4.0116          | 1.0 | 0.9940 |
| 4.0004        | 8.5106  | 1200 | 4.0030          | 1.0 | 0.9943 |
| 4.0004        | 9.2199  | 1300 | 4.0032          | 1.0 | 0.9940 |
| 4.0004        | 9.9291  | 1400 | 4.0105          | 1.0 | 0.9940 |
| 3.997         | 10.6383 | 1500 | 3.9983          | 1.0 | 0.9940 |
| 3.997         | 11.3475 | 1600 | 4.0067          | 1.0 | 0.9943 |
| 3.997         | 12.0567 | 1700 | 4.0020          | 1.0 | 0.9940 |
| 3.997         | 12.7660 | 1800 | 4.0080          | 1.0 | 0.9940 |
| 3.997         | 13.4752 | 1900 | 4.0015          | 1.0 | 0.9940 |
| 3.9966        | 14.1844 | 2000 | 3.9996          | 1.0 | 0.9940 |
| 3.9966        | 14.8936 | 2100 | 4.0013          | 1.0 | 0.9940 |
| 3.9966        | 15.6028 | 2200 | 4.0080          | 1.0 | 0.9943 |
| 3.9966        | 16.3121 | 2300 | 4.0041          | 1.0 | 0.9940 |
| 3.9966        | 17.0213 | 2400 | 4.0024          | 1.0 | 0.9940 |
| 3.9954        | 17.7305 | 2500 | 4.0019          | 1.0 | 0.9943 |
| 3.9954        | 18.4397 | 2600 | 3.9999          | 1.0 | 0.9940 |
| 3.9954        | 19.1489 | 2700 | 4.0030          | 1.0 | 0.9940 |
| 3.9954        | 19.8582 | 2800 | 4.0091          | 1.0 | 0.9940 |
| 3.9954        | 20.5674 | 2900 | 4.0078          | 1.0 | 0.9940 |
| 3.9936        | 21.2766 | 3000 | 4.0039          | 1.0 | 0.9940 |
| 3.9936        | 21.9858 | 3100 | 4.0021          | 1.0 | 0.9940 |
| 3.9936        | 22.6950 | 3200 | 4.0009          | 1.0 | 0.9943 |
| 3.9936        | 23.4043 | 3300 | 4.0020          | 1.0 | 0.9940 |
| 3.9936        | 24.1135 | 3400 | 4.0010          | 1.0 | 0.9943 |
| 3.9915        | 24.8227 | 3500 | 4.0006          | 1.0 | 0.9940 |
| 3.9915        | 25.5319 | 3600 | 3.9970          | 1.0 | 0.9940 |
| 3.9915        | 26.2411 | 3700 | 4.0080          | 1.0 | 0.9940 |
| 3.9915        | 26.9504 | 3800 | 4.0023          | 1.0 | 0.9940 |
| 3.9915        | 27.6596 | 3900 | 4.0005          | 1.0 | 0.9943 |
| 3.9927        | 28.3688 | 4000 | 3.9991          | 1.0 | 0.9940 |
| 3.9927        | 29.0780 | 4100 | 4.0035          | 1.0 | 0.9940 |
| 3.9927        | 29.7872 | 4200 | 3.9981          | 1.0 | 0.9943 |
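
The validation loss plateaus near 4.0 from step 200 onward while WER stays at 1.0, a pattern consistent with a CTC model collapsing to trivial predictions rather than learning the transcription task. For reference, the Wer and Cer columns can be computed with the evaluate library; this is a sketch, and the strings below are hypothetical examples rather than data from this run:

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Hypothetical prediction/reference pair, purely for illustration.
predictions = ["きょう は いい てんき"]
references = ["きょう は いい てんき です"]

print(wer_metric.compute(predictions=predictions, references=references))  # word error rate
print(cer_metric.compute(predictions=predictions, references=references))  # character error rate
```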

Framework versions

  • Transformers 4.47.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3