library_name: transformers
tags: []

Model Description

This is OpenAI's whisper-base model trained on the datasets below.

train_steps: 20000
warmup_steps: 2000
lr scheduler: linear warmup cosine decay
max learning rate: 1e-4
batch size: 256
max_grad_norm: 1.0
adamw_beta1: 0.9
adamw_beta2: 0.98
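
The schedule named above (linear warmup followed by cosine decay) can be sketched in plain Python as follows. The step counts and peak learning rate come from the list above; the starting value of 0 during warmup and the final value of 0 at the end of the decay are assumptions, since the card does not state them, and the function name `learning_rate` is only illustrative.

```python
import math

# Values taken from the hyperparameter list above.
MAX_LR = 1e-4
WARMUP_STEPS = 2000
TRAIN_STEPS = 20000


def learning_rate(step: int) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay.

    Assumptions (not stated in the card): warmup starts from 0 and the
    cosine decay ends at 0 at TRAIN_STEPS.
    """
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to MAX_LR.
        return MAX_LR * step / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - WARMUP_STEPS) / (TRAIN_STEPS - WARMUP_STEPS)
    progress = min(progress, 1.0)
    # Half-cosine from MAX_LR down to 0.
    return MAX_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 1000, 2000, 11000, 20000):
        print(step, learning_rate(step))
```

Under these assumptions the rate peaks at step 2000 and falls to half of the peak midway through the decay phase (step 11000).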

Evaluation

https://github.com/rtzr/Awesome-Korean-Speech-Recognition

The results below are for the test sets in the repository above, excluding the 주요 영역별 회의 음성 (meeting speech by major domain) test set. In the table below, whisper_base_komixv2 is this model.

| Model | cv_15_ko | fleurs_ko | kcall_testset | kconf_test | kcounsel_test | klec_testset | kspon_clean | kspon_other |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| whisper_base | 21.16 | 11.89 | 42.56 | 27.62 | 22.24 | 28.65 | 30.41 | 27.02 |
| whisper_base_komix | 15.42 | 7.16 | 20.86 | 14.24 | 12.64 | 13.44 | 12.26 | 12.12 |
| whisper_base_komixv2 | 13.04 | 7.04 | 10.54 | 13.1 | 10.65 | 12.99 | 12.44 | 12.56 |
| whisper_large_v3 | 5.11 | 3.72 | 5.45 | 9.35 | 3.83 | 8.46 | 15.08 | 12.89 |
| whisper_turbo | 5.38 | 3.95 | 5.89 | 9.77 | 4.21 | 9.27 | 16.49 | 13.54 |
| whisper_turbo_lora | 6.25 | 4.0 | 6.51 | 9.94 | 5.05 | 8.84 | 9.35 | 9.29 |
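
For a quick comparison across rows, the table above can be summarized with an unweighted average over the eight test sets. The sketch below is illustrative only: the card does not name the metric, so it assumes the scores are error rates where lower is better, and the helper names `macro_average` and `relative_reduction` are hypothetical. An unweighted average ignores differences in test set size.

```python
from statistics import mean

# Column order matches the table above.
TEST_SETS = (
    "cv_15_ko", "fleurs_ko", "kcall_testset", "kconf_test",
    "kcounsel_test", "klec_testset", "kspon_clean", "kspon_other",
)

# Scores copied from the table above, one list per model.
SCORES = {
    "whisper_base": [21.16, 11.89, 42.56, 27.62, 22.24, 28.65, 30.41, 27.02],
    "whisper_base_komix": [15.42, 7.16, 20.86, 14.24, 12.64, 13.44, 12.26, 12.12],
    "whisper_base_komixv2": [13.04, 7.04, 10.54, 13.1, 10.65, 12.99, 12.44, 12.56],
    "whisper_large_v3": [5.11, 3.72, 5.45, 9.35, 3.83, 8.46, 15.08, 12.89],
    "whisper_turbo": [5.38, 3.95, 5.89, 9.77, 4.21, 9.27, 16.49, 13.54],
    "whisper_turbo_lora": [6.25, 4.0, 6.51, 9.94, 5.05, 8.84, 9.35, 9.29],
}


def macro_average(model: str) -> float:
    """Unweighted mean of a model's scores over all test sets."""
    return mean(SCORES[model])


def relative_reduction(baseline: str, candidate: str) -> float:
    """Fractional drop in the macro-averaged score from baseline to candidate.

    Assumes lower scores are better (not stated in the card).
    """
    base = macro_average(baseline)
    return (base - macro_average(candidate)) / base


if __name__ == "__main__":
    for name in SCORES:
        print(f"{name}: {macro_average(name):.2f}")
    print(relative_reduction("whisper_base", "whisper_base_komixv2"))
```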

Acknowledgement

  • This model was trained with support from Google's TRC program.
  • Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC)