xls-r-1b-bemgen-combined-model

This model is a fine-tuned version of facebook/wav2vec2-xls-r-1b on the BEMGEN - NA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2509
  • Wer: 0.3923
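
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It assumes the Wer value reported here follows that standard definition. The function name `word_error_rate` and the sample strings are illustrative and do not come from the training code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder tokens, not real transcripts.
    print(word_error_rate("a b c d", "a c d"))  # one deletion out of four words
    print(word_error_rate("a b", "a b c d e"))  # three insertions out of two words
```

Because insertions count as errors while the denominator is the reference length, the ratio can exceed 1.0, which is consistent with the 1.0003 value logged at step 100.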

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
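
As a rough illustration of how two of these settings interact, the sketch below reproduces the total batch size arithmetic and the shape of a linear schedule with warmup. It rests on two assumptions that are not stated in this card: training ran on a single device, and the run was planned for about 16,800 optimizer steps (estimated from the log, where step 100 corresponds to epoch 0.1784, multiplied by 30 epochs). The names `effective_batch_size`, `linear_schedule_lr`, and `ASSUMED_TOTAL_STEPS` are hypothetical.

```python
LEARNING_RATE = 3e-05
WARMUP_STEPS = 500
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 2

# Assumption: roughly 100 / 0.1784 = 560 steps per epoch, times 30 epochs.
ASSUMED_TOTAL_STEPS = 16_800


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to each optimizer update (single device assumed)."""
    return per_device * accum_steps * num_devices


def linear_schedule_lr(step: int, total_steps: int,
                       base_lr: float = LEARNING_RATE,
                       warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 250, 500, 1900, ASSUMED_TOTAL_STEPS):
        print(step, linear_schedule_lr(step, ASSUMED_TOTAL_STEPS))
```

Under these assumptions the learning rate peaks at step 500 and is still close to its peak at step 1900, the last step shown in the training log.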

Training results

| Training Loss | Epoch  | Step | Validation Loss | Wer    |
|---------------|--------|------|-----------------|--------|
| No log        | 0.1784 | 100  | 3.4413          | 1.0003 |
| No log        | 0.3568 | 200  | 2.9149          | 1.0    |
| No log        | 0.5352 | 300  | 0.7768          | 0.9235 |
| No log        | 0.7136 | 400  | 0.6057          | 0.9047 |
| 5.3372        | 0.8921 | 500  | 0.4317          | 0.6720 |
| 5.3372        | 1.0696 | 600  | 0.3997          | 0.6704 |
| 5.3372        | 1.2480 | 700  | 0.3611          | 0.6405 |
| 5.3372        | 1.4264 | 800  | 0.3441          | 0.5603 |
| 5.3372        | 1.6048 | 900  | 0.2945          | 0.4914 |
| 0.6459        | 1.7832 | 1000 | 0.3041          | 0.4924 |
| 0.6459        | 1.9616 | 1100 | 0.2805          | 0.4681 |
| 0.6459        | 2.1392 | 1200 | 0.2774          | 0.5108 |
| 0.6459        | 2.3176 | 1300 | 0.2683          | 0.4254 |
| 0.6459        | 2.4960 | 1400 | 0.2644          | 0.4382 |
| 0.4599        | 2.6744 | 1500 | 0.2446          | 0.4142 |
| 0.4599        | 2.8528 | 1600 | 0.2473          | 0.4118 |
| 0.4599        | 3.0303 | 1700 | 0.2492          | 0.3961 |
| 0.4599        | 3.2087 | 1800 | 0.2467          | 0.4070 |
| 0.4599        | 3.3872 | 1900 | 0.2509          | 0.3923 |
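
The log above can be inspected programmatically, for example to check which evaluation step gave the lowest WER and which gave the lowest validation loss. The sketch below copies the logged values by hand; the `EVAL_LOG` list and the `best_row` helper are illustrative names and are not part of the training code.

```python
# (step, epoch, validation_loss, wer), copied from the training results above.
EVAL_LOG = [
    (100, 0.1784, 3.4413, 1.0003), (200, 0.3568, 2.9149, 1.0),
    (300, 0.5352, 0.7768, 0.9235), (400, 0.7136, 0.6057, 0.9047),
    (500, 0.8921, 0.4317, 0.6720), (600, 1.0696, 0.3997, 0.6704),
    (700, 1.2480, 0.3611, 0.6405), (800, 1.4264, 0.3441, 0.5603),
    (900, 1.6048, 0.2945, 0.4914), (1000, 1.7832, 0.3041, 0.4924),
    (1100, 1.9616, 0.2805, 0.4681), (1200, 2.1392, 0.2774, 0.5108),
    (1300, 2.3176, 0.2683, 0.4254), (1400, 2.4960, 0.2644, 0.4382),
    (1500, 2.6744, 0.2446, 0.4142), (1600, 2.8528, 0.2473, 0.4118),
    (1700, 3.0303, 0.2492, 0.3961), (1800, 3.2087, 0.2467, 0.4070),
    (1900, 3.3872, 0.2509, 0.3923),
]

# Column positions within each tuple.
STEP, EPOCH, VAL_LOSS, WER = range(4)


def best_row(rows, column):
    """Return the row with the smallest value in the given column."""
    return min(rows, key=lambda row: row[column])


if __name__ == "__main__":
    by_wer = best_row(EVAL_LOG, WER)
    by_loss = best_row(EVAL_LOG, VAL_LOSS)
    print("lowest WER:", by_wer[WER], "at step", by_wer[STEP])
    print("lowest validation loss:", by_loss[VAL_LOSS], "at step", by_loss[STEP])
```

In this log the two criteria disagree: the lowest WER (0.3923) is at step 1900, which matches the headline results, while the lowest validation loss (0.2446) is at step 1500.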

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Safetensors

  • Model size: 963M params
  • Tensor type: F32

Model tree for csikasote/xls-r-1b-bemgen-combined-model

  • Base model: facebook/wav2vec2-xls-r-1b
  • Relation: fine-tuned (this model)