Context length is not 128k

#41
by pseudotensor - opened

vLLM uses a default of 8k, and I can't make it use 128k.

https://github.com/vllm-project/vllm/issues/3676
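For reference, a minimal sketch of overriding the context length through vLLM's offline Python API; the model id and the 131072 target are placeholders, and whether it actually loads still depends on the checkpoint's config and available VRAM.

```python
from vllm import LLM, SamplingParams

# Hypothetical model id; substitute the actual repo.
# max_model_len caps the context the engine allocates KV cache for;
# it must not exceed what the model's config / RoPE setup supports.
llm = LLM(
    model="org/model-128k",
    max_model_len=131072,
    gpu_memory_utilization=0.95,
)

params = SamplingParams(max_tokens=256)
print(llm.generate(["Hello"], params)[0].outputs[0].text)
```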

You can, just change the config.json.
But 128k would take over 130 GB of VRAM alone; I can only fit 64k in 96 GB.
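That VRAM figure presumably refers to the KV cache. A rough back-of-the-envelope sketch follows; the layer count, KV-head count, and head dimension are illustrative placeholders, so plug in the values from this model's config.json.

```python
def kv_cache_gib(seq_len, num_layers=40, num_kv_heads=64,
                 head_dim=128, dtype_bytes=2, batch=1):
    """Approximate KV cache size: 2 (K and V) * layers * kv_heads
    * head_dim * dtype bytes per token, times the number of tokens."""
    per_token = 2 * num_layers * num_kv_heads * head_dim * dtype_bytes
    return batch * seq_len * per_token / 1024**3

for ctx in (8_192, 65_536, 131_072):
    print(f"{ctx:>7} tokens ~ {kv_cache_gib(ctx):.1f} GiB")
```

With these placeholder dimensions, 128k tokens lands well above 130 GiB while 64k is roughly half that, which matches the ballpark in the comment above.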

As I argue in that vLLM thread, I don't think that's how it should be done. We shouldn't just change the embedding size, since RoPE scaling is used; the scaling should be part of the calculation.
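A sketch of that idea: derive the effective context from both max_position_embeddings and the rope_scaling block instead of hand-editing the embedding size. The field names follow the usual Hugging Face config.json layout, and multiplying by "factor" is a simplification that applies to linear / dynamic-NTK style scaling; other schemes need their own handling.

```python
import json

def effective_context(config_path):
    """Derive a usable context window from config.json rather than
    hard-editing max_position_embeddings."""
    with open(config_path) as f:
        cfg = json.load(f)
    base = cfg.get("max_position_embeddings", 8192)
    rope = cfg.get("rope_scaling") or {}
    # Linear / dynamic-NTK scaling stretch the trained window by "factor".
    factor = rope.get("factor", 1.0)
    return int(base * factor)

# print(effective_context("config.json"))
```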

Based on the discussion in this post, should we use --max-model-len when running vLLM with this model to get the larger context window? By default I am currently getting errors, since it is using 8192 as the length.
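If you want to confirm that the server actually picked up the override (the current flag spelling is --max-model-len), one way is to query the OpenAI-compatible endpoint; recent vLLM builds report the resolved max_model_len in the /v1/models payload, though the exact fields vary by version, so treat this as a sketch.

```python
import json
import urllib.request

# Assumes a local server started with something like:
#   vllm serve <model> --max-model-len 131072
with urllib.request.urlopen("http://localhost:8000/v1/models") as resp:
    models = json.load(resp)

for m in models.get("data", []):
    # max_model_len is present in recent vLLM versions; fall back gracefully.
    print(m.get("id"), "max_model_len =", m.get("max_model_len", "n/a"))
```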

