Model Continuously Generating Text After Completing Task
#6 by endrikacupaj
Hi,
Thank you for the great work and for open-sourcing the models!
I’ve noticed that the model sometimes keeps generating text indefinitely, even after it has answered the question or completed the task. It seems to fail to emit an EOS token to stop generation, which leads to unnecessary token usage and makes the model harder to use in real applications.
I’m currently running the model with vLLM, and to work around this I have to set the max_tokens argument and post-process the responses.
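For reference, my current workaround looks roughly like this (a minimal sketch; the model path, prompt, and stop string are placeholders, not the actual values I use):

```python
from vllm import LLM, SamplingParams

llm = LLM(model="path/to/model")  # placeholder model path

# Hard-cap generation length, plus an explicit stop string as a safety net.
params = SamplingParams(
    temperature=0.7,   # example value
    max_tokens=512,    # bounds runaway generations
    stop=["</s>"],     # placeholder; depends on the chat template
)

outputs = llm.generate(["example prompt"], params)
text = outputs[0].outputs[0].text  # still needs post-processing afterwards
```

This keeps token usage bounded, but picking a max_tokens value per task is awkward, so it’s only a workaround.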
Have you encountered this behavior before? Do you know why it happens, and is there a way to fix it?
Best,
Endri
Hi Endri, thanks for opening this issue. Can you provide any examples and/or specifics about this? The following would help narrow down the behavior (see the repro sketch after the list):
- Sample prompts that you see cause this behavior
- Sampling parameters (temperature, top_k, etc.)
- Description of whether the behavior is deterministic or random
- Characteristics of the load in vLLM that coincides with the behavior
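If it helps, a minimal script along these lines would capture most of that in one go (a sketch; the model path and prompt are placeholders). With temperature=0 the decoding is greedy and deterministic, and finish_reason tells us whether the model actually emitted EOS or just hit the token cap:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="path/to/model")  # placeholder model path

# Greedy decoding: temperature=0 removes sampling randomness, so reruns
# show whether the runaway generation is deterministic.
params = SamplingParams(temperature=0.0, max_tokens=1024)

outputs = llm.generate(["prompt that triggers the behavior"], params)
completion = outputs[0].outputs[0]

# "stop" means an EOS/stop token was generated; "length" means generation
# only ended because it hit max_tokens.
print(completion.finish_reason)
print(completion.text)
```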