---
license: apache-2.0
base_model: openai/whisper-tiny
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: whisper-tiny-myanmar
    results: []
datasets:
  - chuuhtetnaing/myanmar-speech-dataset-openslr-80
language:
  - my
pipeline_tag: automatic-speech-recognition
library_name: transformers
---

# whisper-tiny-myanmar

This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on the [chuuhtetnaing/myanmar-speech-dataset-openslr-80](https://huggingface.co/datasets/chuuhtetnaing/myanmar-speech-dataset-openslr-80) dataset. It achieves the following results on the evaluation set:

- Loss: 0.2353
- WER: 61.8878
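
WER is the word error rate, reported above as a percentage. As a reference for how such a score is computed, here is a minimal sketch using the `evaluate` library (an assumption for illustration; the card does not state which implementation produced the number):

```python
import evaluate

wer_metric = evaluate.load("wer")

# Toy English strings for illustration only; the score above was measured on
# the Burmese evaluation split, not on these examples.
references = ["this is a test sentence"]
predictions = ["this is the test sentence"]

# `compute` returns a fraction; multiply by 100 to match the percentage above.
print(100 * wer_metric.compute(predictions=predictions, references=references))
```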

## Usage

```python
from datasets import Audio, load_dataset
from transformers import pipeline

# Load a sample audio clip from the test split, resampled to 16 kHz
dataset = load_dataset("chuuhtetnaing/myanmar-speech-dataset-openslr-80")
dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))
test_dataset = dataset["test"]
input_speech = test_dataset[42]["audio"]

pipe = pipeline("automatic-speech-recognition", model="chuuhtetnaing/whisper-tiny-myanmar")

output = pipe(input_speech, generate_kwargs={"language": "myanmar", "task": "transcribe"})
print(output["text"])  # ကျွန်မ ပြည်ပ မှာ ပညာ သင် တော့ စာမြီးပွဲ ကို တပတ်တခါ စစ်တယ်
```
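
If you prefer the lower-level API over `pipeline`, the same transcription can be produced with the processor and model directly. This is a minimal sketch assuming the repository hosts the standard Whisper processor alongside the model; it reuses `input_speech` from the snippet above:

```python
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor

processor = WhisperProcessor.from_pretrained("chuuhtetnaing/whisper-tiny-myanmar")
model = WhisperForConditionalGeneration.from_pretrained("chuuhtetnaing/whisper-tiny-myanmar")

# Convert the raw waveform into log-Mel input features
inputs = processor(
    input_speech["array"],
    sampling_rate=input_speech["sampling_rate"],
    return_tensors="pt",
)

# Decode with the language and task forced, as in the pipeline call above
with torch.no_grad():
    predicted_ids = model.generate(
        inputs.input_features, language="myanmar", task="transcribe"
    )

print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```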

## Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding `Seq2SeqTrainingArguments` follows the list):

- learning_rate: 0.0003
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 200
- num_epochs: 30
- mixed_precision_training: Native AMP
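
For reproduction, the settings above map naturally onto `Seq2SeqTrainingArguments`. The sketch below assumes the standard Transformers `Seq2SeqTrainer` recipe; `output_dir`, the per-epoch evaluation cadence (consistent with the results table below), and `predict_with_generate` are assumptions, not taken from the card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-tiny-myanmar",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=200,
    num_train_epochs=30,
    fp16=True,                    # "Native AMP" mixed precision
    evaluation_strategy="epoch",  # matches the per-epoch results table
    predict_with_generate=True,   # decode during eval so WER can be computed
)
```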

## Training results

| Training Loss | Epoch | Step | Validation Loss | WER      |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 18   | 1.2679          | 357.6135 |
| 1.483         | 2.0   | 36   | 1.0660          | 102.5378 |
| 1.0703        | 3.0   | 54   | 0.9530          | 106.3669 |
| 1.0703        | 4.0   | 72   | 0.8399          | 100.5343 |
| 0.8951        | 5.0   | 90   | 0.7728          | 107.6581 |
| 0.7857        | 6.0   | 108  | 0.7143          | 107.5245 |
| 0.6614        | 7.0   | 126  | 0.5174          | 104.4078 |
| 0.6614        | 8.0   | 144  | 0.3004          | 90.3384  |
| 0.3519        | 9.0   | 162  | 0.2447          | 82.4577  |
| 0.2165        | 10.0  | 180  | 0.2333          | 83.8825  |
| 0.2165        | 11.0  | 198  | 0.2022          | 77.0258  |
| 0.1532        | 12.0  | 216  | 0.1759          | 73.0632  |
| 0.1039        | 13.0  | 234  | 0.1852          | 72.0837  |
| 0.0675        | 14.0  | 252  | 0.1902          | 71.2823  |
| 0.0675        | 15.0  | 270  | 0.1882          | 70.5254  |
| 0.0517        | 16.0  | 288  | 0.2002          | 69.7240  |
| 0.0522        | 17.0  | 306  | 0.1965          | 67.7649  |
| 0.0522        | 18.0  | 324  | 0.1935          | 68.2102  |
| 0.0404        | 19.0  | 342  | 0.2132          | 67.9430  |
| 0.0308        | 20.0  | 360  | 0.2110          | 66.6963  |
| 0.0236        | 21.0  | 378  | 0.2141          | 65.9394  |
| 0.0236        | 22.0  | 396  | 0.2200          | 64.4702  |
| 0.0116        | 23.0  | 414  | 0.2227          | 63.4016  |
| 0.0055        | 24.0  | 432  | 0.2244          | 64.1585  |
| 0.0025        | 25.0  | 450  | 0.2254          | 62.4666  |
| 0.0025        | 26.0  | 468  | 0.2282          | 63.1790  |
| 0.0006        | 27.0  | 486  | 0.2320          | 61.7097  |
| 0.0002        | 28.0  | 504  | 0.2342          | 62.0659  |
| 0.0002        | 29.0  | 522  | 0.2350          | 62.0214  |
| 0.0001        | 30.0  | 540  | 0.2353          | 61.8878  |

## Framework versions

- Transformers 4.35.2
- PyTorch 2.1.1+cu121
- Datasets 2.14.5
- Tokenizers 0.15.1