Needs max_model_len attribute in config.json
#12 · opened by shaily99
vLLM defaults to an 8k context length and cannot use the full 128k context.
This has been discussed in several vLLM issues. It seems a fix was added that allows a higher max model length, as long as there is a max_model_len key in the config.
This key appears to have been added to the unquantized version of Command R, but it has not been added to this quantized repo.
Ref: https://github.com/vllm-project/vllm/issues/3676#issuecomment-2026793168
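In the meantime, a possible workaround (a minimal sketch, not the maintainers' fix) is to pass max_model_len explicitly when constructing the vLLM engine, which overrides the default derived from the config. The repo id and the 131072 value below are illustrative placeholders and should be checked against this repo's model card:

```python
# Sketch of a workaround: override the context length explicitly so vLLM
# does not fall back to its 8k default while config.json lacks max_model_len.
from vllm import LLM, SamplingParams

llm = LLM(
    model="CohereForAI/c4ai-command-r-quantized",  # hypothetical repo id, replace with this repo
    max_model_len=131072,  # assumed 128k context; confirm the weights support it
)

params = SamplingParams(max_tokens=256)
outputs = llm.generate(["Summarize the vLLM context-length issue."], params)
print(outputs[0].outputs[0].text)
```

The OpenAI-compatible server accepts the equivalent `--max-model-len` flag, so the same override can be applied there without touching config.json.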