xls-r-1b-bigcgen-combined-15hrs-model

This model is a fine-tuned version of facebook/wav2vec2-xls-r-1b on the BIGCGEN - NA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5857
  • Wer: 0.6274
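
For reference, below is a minimal transcription sketch using the standard Transformers CTC API. This is an assumption about usage, not code from the card: the model ID matches this repository, while the audio path and the use of torchaudio for loading and resampling are illustrative placeholders.

```python
# Minimal inference sketch (assumed usage, not from the model card).
import torch
import torchaudio
from transformers import AutoProcessor, AutoModelForCTC

model_id = "csikasote/xls-r-1b-bigcgen-combined-15hrs-model"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCTC.from_pretrained(model_id)
model.eval()

# Load audio and resample to the 16 kHz rate expected by wav2vec2 models.
waveform, sample_rate = torchaudio.load("example.wav")  # placeholder path
if sample_rate != 16_000:
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

inputs = processor(waveform.squeeze().numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```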

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30.0
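
The values above map directly onto transformers.TrainingArguments. The sketch below reconstructs that configuration under the assumption that the standard Trainer API was used; the output directory is a placeholder, and any evaluation or saving options are omitted because they are not stated in the card.

```python
# Hedged reconstruction of the training configuration (assumed, not the
# author's actual script). Only hyperparameters listed above are set.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./xls-r-1b-bigcgen-combined-15hrs-model",  # placeholder
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,  # effective train batch size: 4 * 2 = 8
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=30.0,
)
```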

Training results

| Training Loss | Epoch  | Step | Validation Loss | Wer    |
|---------------|--------|------|-----------------|--------|
| No log        | 0.1410 | 100  | 3.8893          | 1.0    |
| No log        | 0.2821 | 200  | 2.6866          | 1.0    |
| No log        | 0.4231 | 300  | 1.4204          | 1.0    |
| No log        | 0.5642 | 400  | 0.8780          | 0.8847 |
| 5.5784        | 0.7052 | 500  | 0.8821          | 0.9583 |
| 5.5784        | 0.8463 | 600  | 0.6898          | 0.7509 |
| 5.5784        | 0.9873 | 700  | 0.6910          | 0.8690 |
| 5.5784        | 1.1283 | 800  | 0.6632          | 0.6810 |
| 5.5784        | 1.2694 | 900  | 0.6165          | 0.6048 |
| 1.2954        | 1.4104 | 1000 | 0.6006          | 0.6134 |
| 1.2954        | 1.5515 | 1100 | 0.6859          | 0.7684 |
| 1.2954        | 1.6925 | 1200 | 0.5857          | 0.6273 |
| 1.2954        | 1.8336 | 1300 | 0.6305          | 0.6155 |
| 1.2954        | 1.9746 | 1400 | 0.6213          | 0.5856 |
| 1.1582        | 2.1157 | 1500 | 0.5891          | 0.5984 |
| 1.1582        | 2.2567 | 1600 | 0.6606          | 0.7041 |
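
Wer in the table above is the word error rate. A short sketch of computing it with the Hugging Face evaluate library follows; the example strings are placeholders, not data from this evaluation.

```python
# WER computation sketch using the `evaluate` library (illustrative strings).
import evaluate

wer_metric = evaluate.load("wer")
wer = wer_metric.compute(
    predictions=["the cat sat on mat"],        # placeholder hypothesis
    references=["the cat sat on the mat"],     # placeholder reference
)
print(wer)  # 0.1667 here: one deletion out of six reference words
```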

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
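
To pin a matching environment, something like the following should work; note that the listed PyTorch build (2.5.1+cu124) is a CUDA 12.4 wheel, which may require installing from the PyTorch package index rather than plain PyPI.

```bash
# Assumed install command matching the versions listed above.
pip install "transformers==4.47.1" "torch==2.5.1" "datasets==3.2.0" "tokenizers==0.21.0"
```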