Mel mismatch with faster-whisper
I have a very limited understanding and am mainly experimenting. My understanding is that OpenAI Whisper large-v3 uses 128 mel bins.
When I convert this model from safetensors to CT2 with a Python script like this:
import ctranslate2  # type: ignore
from ctranslate2.converters import TransformersConverter  # type: ignore

model_name_or_path = "Finnish-NLP/whisper-large-finnish-v3"
output_dir = "ct2/whisper-large-finnish-v3"

converter = TransformersConverter(model_name_or_path)
converter.convert(
    output_dir,
    quantization="float16",
)
Using the resulting model with faster-whisper fails with a mel mismatch:
ValueError: Invalid input features shape: expected an input with shape (1, 128, 3000), but got an input with shape (1, 80, 3000) instead
Does something go wrong during the conversion, or what else could cause this?
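One way to narrow it down is to check what feature size the converted model directory actually advertises. The helper below is hypothetical; it just reads `feature_size` (the number of mel bins) from `preprocessor_config.json`, if that file made it into the output directory at all:

```python
import json
import os

def mel_bins_from_preprocessor(model_dir: str):
    # Hypothetical helper: report the mel bin count stored in
    # preprocessor_config.json, or None if the file is missing
    # (in which case an 80-bin default may end up being used).
    path = os.path.join(model_dir, "preprocessor_config.json")
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f).get("feature_size")
```

If this returns None or 80 for a large-v3 model, the conversion most likely dropped or never produced the preprocessor config.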
Edit: using the -ct2 model you had available seems to have fixed this, so perhaps the converter just needs more parameters when run manually.
I think I used the CLI version of the converter, following some instructions.
It might be that we first needed to convert our fine-tuned model from the Hugging Face format to the original OpenAI format, and then to CT2.
But yeah, there is an already-converted CT2 version available.