metadata

license: apache-2.0
base_model: openai/whisper-tiny
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: whisper-tiny-myanmar
    results: []
datasets:
  - chuuhtetnaing/myanmar-speech-dataset-openslr-80
language:
  - my
pipeline_tag: automatic-speech-recognition
library_name: transformers

whisper-tiny-myanmar

This model is a fine-tuned version of openai/whisper-tiny on the chuuhtetnaing/myanmar-speech-dataset-openslr-80 dataset, intended for Burmese (Myanmar) automatic speech recognition. After 30 epochs of training it achieves the following results on the evaluation set:

Loss: 0.2353
WER: 61.8878 (percent)
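
For context on the WER figure: word error rate is the number of word-level substitutions, deletions and insertions divided by the number of reference words, reported here as a percentage. Because insertions count as errors, the value can exceed 100, as it does in the first epochs of the training log. A minimal sketch in plain Python follows; the function name `word_error_rate` and the toy sentences are illustrative assumptions, and the numbers in this card were presumably produced by the standard `wer` metric rather than by this code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent using word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Toy English sentences, not taken from the dataset.
print(word_error_rate("the cat sat", "the cat sat"))      # identical transcripts
print(word_error_rate("the cat sat", "the dog sat"))      # one substitution in three words
print(word_error_rate("hello", "hello there my friend"))  # three insertions against one reference word
```

The last call shows how a short reference with a long hypothesis pushes WER above 100, which is typical of an under-trained model that emits too many tokens.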

Usage

from datasets import Audio, load_dataset
from transformers import pipeline

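# Load the dataset and resample the audio to 16 kHz, the rate Whisper expects
dataset = load_dataset("chuuhtetnaing/myanmar-speech-dataset-openslr-80")
dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))
test_dataset = dataset["test"]
input_speech = test_dataset[42]["audio"]  # dict with the waveform array and its sampling rate

# Build the speech recognition pipeline from the fine-tuned checkpoint
pipe = pipeline("automatic-speech-recognition", model="chuuhtetnaing/whisper-tiny-myanmar")

# Force Burmese transcription instead of relying on language detection
output = pipe(input_speech, generate_kwargs={"language": "myanmar", "task": "transcribe"})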
print(output['text']) # ကျွန်မ ပြည်ပ မှာ ပညာ သင် တော့ စာမြီးပွဲ ကို တပတ်တခါ စစ်တယ်

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 200
num_epochs: 30
mixed_precision_training: Native AMP
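
To make the schedule concrete: with `lr_scheduler_type: linear` and 200 warmup steps, the learning rate ramps linearly from 0 to the peak of 0.0003 and then decays linearly to 0 at the last step. The sketch below reimplements that shape in plain Python instead of calling the library; the total of 540 steps is taken from the final row of the training log, and the function name `linear_lr` is an assumption for illustration.

```python
PEAK_LR = 3e-4       # learning_rate
WARMUP_STEPS = 200   # lr_scheduler_warmup_steps
TOTAL_STEPS = 540    # last step in the training log (30 epochs x 18 steps)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


# Sample the schedule at the start, mid-warmup, peak, mid-decay and end.
for step in (0, 100, 200, 370, 540):
    print(step, linear_lr(step))
```

At 18 steps per epoch, the 200 warmup steps span roughly the first 11 of the 30 epochs, so the peak learning rate is only reached about a third of the way through training.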

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
No log	1.0	18	1.2679	357.6135
1.483	2.0	36	1.0660	102.5378
1.0703	3.0	54	0.9530	106.3669
1.0703	4.0	72	0.8399	100.5343
0.8951	5.0	90	0.7728	107.6581
0.7857	6.0	108	0.7143	107.5245
0.6614	7.0	126	0.5174	104.4078
0.6614	8.0	144	0.3004	90.3384
0.3519	9.0	162	0.2447	82.4577
0.2165	10.0	180	0.2333	83.8825
0.2165	11.0	198	0.2022	77.0258
0.1532	12.0	216	0.1759	73.0632
0.1039	13.0	234	0.1852	72.0837
0.0675	14.0	252	0.1902	71.2823
0.0675	15.0	270	0.1882	70.5254
0.0517	16.0	288	0.2002	69.7240
0.0522	17.0	306	0.1965	67.7649
0.0522	18.0	324	0.1935	68.2102
0.0404	19.0	342	0.2132	67.9430
0.0308	20.0	360	0.2110	66.6963
0.0236	21.0	378	0.2141	65.9394
0.0236	22.0	396	0.2200	64.4702
0.0116	23.0	414	0.2227	63.4016
0.0055	24.0	432	0.2244	64.1585
0.0025	25.0	450	0.2254	62.4666
0.0025	26.0	468	0.2282	63.1790
0.0006	27.0	486	0.2320	61.7097
0.0002	28.0	504	0.2342	62.0659
0.0002	29.0	522	0.2350	62.0214
0.0001	30.0	540	0.2353	61.8878
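
The log can also be read programmatically. The sketch below copies the epoch, validation loss and WER columns from the table above and looks up the epochs with the lowest validation loss and the lowest WER. The variable names are illustrative assumptions, and nothing here says which checkpoint was published beyond the final-epoch numbers reported at the top of this card.

```python
# (epoch, validation_loss, wer) copied from the training results table
rows = [
    (1, 1.2679, 357.6135), (2, 1.0660, 102.5378), (3, 0.9530, 106.3669),
    (4, 0.8399, 100.5343), (5, 0.7728, 107.6581), (6, 0.7143, 107.5245),
    (7, 0.5174, 104.4078), (8, 0.3004, 90.3384), (9, 0.2447, 82.4577),
    (10, 0.2333, 83.8825), (11, 0.2022, 77.0258), (12, 0.1759, 73.0632),
    (13, 0.1852, 72.0837), (14, 0.1902, 71.2823), (15, 0.1882, 70.5254),
    (16, 0.2002, 69.7240), (17, 0.1965, 67.7649), (18, 0.1935, 68.2102),
    (19, 0.2132, 67.9430), (20, 0.2110, 66.6963), (21, 0.2141, 65.9394),
    (22, 0.2200, 64.4702), (23, 0.2227, 63.4016), (24, 0.2244, 64.1585),
    (25, 0.2254, 62.4666), (26, 0.2282, 63.1790), (27, 0.2320, 61.7097),
    (28, 0.2342, 62.0659), (29, 0.2350, 62.0214), (30, 0.2353, 61.8878),
]

# Pick the row that minimises each criterion.
best_by_loss = min(rows, key=lambda row: row[1])
best_by_wer = min(rows, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("lowest WER:", best_by_wer)
```

Validation loss bottoms out at epoch 12 (0.1759) while WER keeps improving until epoch 27 (61.7097), so the two criteria would select different checkpoints.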

Framework versions

Transformers 4.35.2
PyTorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.15.1