Does the 32K context in this image apply to the llama-3 model?
No, there are plans to train Qwen 72B.
8K for llama-3, 32K for qwen.
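For reference, you can read the configured context window straight from each model's config on the Hub. A minimal sketch, assuming the `transformers` library is installed and you have access to the (gated) Llama-3 repo; the repo IDs are illustrative:

```python
from transformers import AutoConfig

# Illustrative repo IDs; meta-llama/Meta-Llama-3-8B is gated and needs an access token.
for repo_id in ("meta-llama/Meta-Llama-3-8B", "Qwen/Qwen1.5-72B"):
    cfg = AutoConfig.from_pretrained(repo_id)
    # max_position_embeddings is the model's configured context length in tokens.
    print(f"{repo_id}: {cfg.max_position_embeddings} tokens")
```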