metadata
license: apache-2.0
base_model: openai/whisper-small
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: whisper-small-myanmar
    results: []
datasets:
  - chuuhtetnaing/myanmar-speech-dataset-openslr-80
language:
  - my
pipeline_tag: automatic-speech-recognition
library_name: transformers

whisper-small-myanmar

This model is a fine-tuned version of openai/whisper-small on the chuuhtetnaing/myanmar-speech-dataset-openslr-80 dataset. It achieves the following results on the evaluation set after the final epoch (epoch 30, step 1080):

  • Loss: 0.1904
  • Wer: 49.0650
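The card does not state how Wer was computed. The sketch below assumes the usual definition: word-level edit distance divided by the number of reference words, scaled by 100, with words split on whitespace. Both the definition and the tokenization are assumptions, and `edit_distance` and `wer_percent` are hypothetical helpers, not the author's evaluation code. Because inserted words count as errors, the value can exceed 100, which is consistent with the early-epoch values under Training results.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer_percent(reference, hypothesis):
    """Word error rate in percent, assuming whitespace-separated words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hyp_words) / len(ref_words)


print(wer_percent("a b c d", "a x c d"))      # one substitution in four words
print(wer_percent("a b", "a b a b a b"))      # four insertions against two words
```

The second call shows how a model that repeats itself can score above 100: the hypothesis contains more wrong words than the reference has words in total.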

Usage

from datasets import Audio, load_dataset
from transformers import pipeline

# Load one sample from the test split, resampled to the 16 kHz that Whisper expects
dataset = load_dataset("chuuhtetnaing/myanmar-speech-dataset-openslr-80")
dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))
test_dataset = dataset['test']
input_speech = test_dataset[42]['audio']

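# Name the task explicitly instead of relying on the Hub metadata to infer it
pipe = pipeline("automatic-speech-recognition", model="chuuhtetnaing/whisper-small-myanmar")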

output = pipe(input_speech, generate_kwargs={"language": "myanmar", "task": "transcribe"})
print(output['text']) # ကျွန်မ ပြည်ပ မှာ ပညာသင် တော့ စာမေးပွဲ ကို တပတ်တခါ စစ်တယ်

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 200
  • num_epochs: 30
  • mixed_precision_training: Native AMP
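To make the learning-rate settings concrete, here is a minimal sketch of a linear schedule with warmup, assuming it follows the shape of the Transformers linear scheduler: a linear ramp from 0 to the peak over the warmup steps, then a linear decay to 0 at the last step. The total of 1080 steps is taken from the last row of the training results; `linear_lr` is a hypothetical helper written for illustration.

```python
PEAK_LR = 3e-4       # learning_rate
WARMUP_STEPS = 200   # lr_scheduler_warmup_steps
TOTAL_STEPS = 1080   # final step in the training results (30 epochs x 36 steps)


def linear_lr(step):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 100, 200, 640, 1080):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the peak rate is reached at step 200, which falls in the sixth epoch, so the first five epochs in the results run entirely during warmup.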

Training results

Training Loss  Epoch  Step  Validation Loss  Wer
1.2566         1.0    36    0.8893           215.0045
0.8862         2.0    72    0.6243           388.6465
0.3546         3.0    108   0.2046           316.8744
0.1839         4.0    144   0.1695           81.3001
0.1198         5.0    180   0.1385           63.8914
0.0969         6.0    216   0.1583           66.0285
0.084          7.0    252   0.1539           70.6589
0.0628         8.0    288   0.1603           61.3090
0.0565         9.0    324   0.1424           60.3295
0.0355         10.0   360   0.1457           58.1478
0.0299         11.0   396   0.1547           57.7916
0.0183         12.0   432   0.1543           54.3633
0.0131         13.0   468   0.1532           54.1407
0.011          14.0   504   0.1604           53.8736
0.0083         15.0   540   0.1630           54.0516
0.0042         16.0   576   0.1711           52.1371
0.0034         17.0   612   0.1670           52.5824
0.0022         18.0   648   0.1649           52.5378
0.0013         19.0   684   0.1802           52.1817
0.0014         20.0   720   0.1820           53.1612
0.002          21.0   756   0.1792           52.7159
0.0016         22.0   792   0.1796           50.7124
0.0004         23.0   828   0.1803           50.4007
0.0003         24.0   864   0.1804           49.4657
0.0001         25.0   900   0.1819           49.2431
0.0            26.0   936   0.1857           49.0205
0.0            27.0   972   0.1879           49.1541
0.0            28.0   1008  0.1893           49.1095
0.0            29.0   1044  0.1901           49.1095
0.0            30.0   1080  0.1904           49.0650
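The table can be read programmatically. The sketch below copies a subset of its rows and finds the epochs with the lowest validation loss and the lowest Wer; these are also the minima over the full table. It also derives the steps per epoch. The bound on examples per epoch assumes a single device and no gradient accumulation, which the card does not confirm.

```python
# (epoch, step, validation_loss, wer), copied from selected rows of the table
ROWS = [
    (1, 36, 0.8893, 215.0045),
    (5, 180, 0.1385, 63.8914),
    (10, 360, 0.1457, 58.1478),
    (15, 540, 0.1630, 54.0516),
    (20, 720, 0.1820, 53.1612),
    (25, 900, 0.1819, 49.2431),
    (26, 936, 0.1857, 49.0205),
    (30, 1080, 0.1904, 49.0650),
]
TRAIN_BATCH_SIZE = 64  # train_batch_size from the hyperparameters

best_by_loss = min(ROWS, key=lambda row: row[2])
best_by_wer = min(ROWS, key=lambda row: row[3])
final_epoch, final_step = ROWS[-1][0], ROWS[-1][1]

steps_per_epoch = final_step // final_epoch
# Upper bound only: the last batch of an epoch may be partial.
max_examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

print("lowest validation loss: epoch", best_by_loss[0], best_by_loss[2])
print("lowest Wer: epoch", best_by_wer[0], best_by_wer[3])
print("steps per epoch:", steps_per_epoch)
print("examples per epoch (at most):", max_examples_per_epoch)
```

Validation loss is lowest at epoch 5 and drifts upward afterwards while training loss falls to 0.0, yet Wer keeps improving until epoch 26. The headline results match the epoch 30 row, not the lowest-Wer row.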

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.15.1