whisper-small-khmer-v2

This model is a fine-tuned version of openai/whisper-small, trained on the openslr, google/fleurs and km-speech-corpus datasets. It achieves the following results on the evaluation set:

Loss: 0.26
WER: 0.6165
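
The word error rate above is a fraction, so 0.6165 corresponds to roughly 62 edits per 100 reference words. As a minimal sketch of how such a score is usually defined (word-level edit distance divided by the number of reference words), the function below may help when comparing transcripts. The name `word_error_rate` is hypothetical, and splitting on whitespace is an assumption: Khmer is normally written without spaces between words, and the card does not say how the text was segmented or normalized before scoring, so this sketch is not guaranteed to reproduce the reported number.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace. Khmer text would need
    a word segmenter first; that step is not shown here.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 edits / 6 words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.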

Model description

The model was fine-tuned on the Google FLEURS, OpenSLR (SLR42) and km-speech-corpus datasets. The example below transcribes a Khmer audio file with the transformers pipeline:

from transformers import pipeline

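# Load the fine-tuned checkpoint as a speech-recognition pipeline.
pipe = pipeline(
    task="automatic-speech-recognition",
    model="seanghay/whisper-small-khmer-v2",
)

# Transcribe a local file, forcing Khmer as the decoding language.
result = pipe(
    "audio.wav",
    generate_kwargs={"language": "<|km|>", "task": "transcribe"},
    batch_size=16,
)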

print(result["text"])

Safetensors

Model size: 242M params
Tensor type: F32
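
As a rough back-of-the-envelope check, the parameter count and tensor type above give an estimate of the size of the weights alone. The sketch below assumes exactly 242 million parameters (the card's figure is rounded) at 4 bytes each for F32. It ignores activations, the decoding cache and framework overhead, so actual memory use at inference time will be higher.

```python
PARAMS = 242_000_000   # "242M params" from the card (rounded figure)
BYTES_PER_PARAM = 4    # F32 means one 32-bit float per parameter

weight_bytes = PARAMS * BYTES_PER_PARAM
weight_mb = weight_bytes / 1e6      # decimal megabytes
weight_gib = weight_bytes / 2**30   # binary gibibytes

print(f"{weight_mb:.0f} MB ({weight_gib:.2f} GiB)")  # prints "968 MB (0.90 GiB)"
```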

Evaluation results

WER on Google FLEURS (test set, self-reported): 0.617