# Whisper Large V2 Portuguese πŸ‡§πŸ‡·πŸ‡΅πŸ‡Ή

Welcome to whisper large-v2 for Portuguese transcription πŸ‘‹πŸ»
Transcribe Portuguese audio to text with the lowest WER among the comparable models listed below. It achieves the following results on the evaluation set:

- Loss: 0.282
- WER: 5.590
This model is a fine-tuned version of openai/whisper-large-v2 on the mozilla-foundation/common_voice_11_0 Portuguese dataset. If you want a lighter model, you may be interested in jlondonobo/whisper-medium-pt, which achieves faster inference with almost no difference in WER.
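A minimal usage sketch with the transformers `pipeline` API (the audio path is a placeholder, and decoding local files requires ffmpeg):

```python
# Transcription sketch using the transformers pipeline API.
# "sample_pt.mp3" is a placeholder path, not a file shipped with the model.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="jlondonobo/whisper-large-v2-pt",
    device=0 if torch.cuda.is_available() else -1,  # use a GPU if available
)

result = asr("sample_pt.mp3")  # decoding local audio files requires ffmpeg
print(result["text"])
```

Swapping the model id for jlondonobo/whisper-medium-pt gives the lighter variant with the same interface.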
## Comparable models

Reported WER is based on the evaluation subset of Common Voice (a sketch of the WER computation follows the table).

| Model | WER | # Parameters |
|---|---|---|
| jlondonobo/whisper-large-v2-pt | 5.590 πŸ€— | 1550M |
| openai/whisper-large-v2 | 6.300 | 1550M |
| jlondonobo/whisper-medium-pt | 6.579 | 769M |
| jonatasgrosman/wav2vec2-large-xlsr-53-portuguese | 11.310 | 317M |
| Edresson/wav2vec2-large-xlsr-coraa-portuguese | 20.080 | 317M |
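For reference, a hedged sketch of how a WER figure like those above can be computed with the evaluate library (requires the evaluate and jiwer packages; the transcripts below are made-up placeholders, not Common Voice data):

```python
# WER computation sketch with the evaluate library.
# The transcripts are hypothetical placeholders, not Common Voice samples.
import evaluate

wer_metric = evaluate.load("wer")
predictions = ["o gato subiu no telhado"]   # model outputs
references = ["o gato subiu ao telhado"]    # ground-truth transcripts
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.3f}")  # reported as a percentage; lower is better
```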
## Training hyperparameters

We used the following hyperparameters for training (a rough mapping onto code follows the list):

- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 5000
- mixed_precision_training: Native AMP
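As a hedged illustration only, these values map onto transformers' `Seq2SeqTrainingArguments` roughly as follows; the original training script is not reproduced here, and the output directory is a placeholder:

```python
# Hedged mapping of the listed hyperparameters onto Seq2SeqTrainingArguments.
# The Adam betas and epsilon above are the Trainer defaults, so they need no
# explicit arguments here.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v2-pt",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,       # 16 x 2 = 32 total train batch size
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=5000,
    fp16=True,                           # "Native AMP" mixed precision
)
```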
## Training results

| Training Loss | Epoch | Step | Validation Loss | WER |
|---|---|---|---|---|
| 0.0828 | 1.09 | 1000 | 0.1868 | 6.778 |
| 0.0241 | 3.07 | 2000 | 0.2057 | 6.109 |
| 0.0084 | 5.06 | 3000 | 0.2367 | 6.029 |
| 0.0015 | 7.04 | 4000 | 0.2469 | 5.709 |
| 0.0009 | 9.02 | 5000 | 0.2821 | 5.590 πŸ€— |
## Framework versions
- Transformers 4.26.0.dev0
- Pytorch 1.13.0+cu117
- Datasets 2.7.1.dev0
- Tokenizers 0.13.2