Question about output length

#59
by skylerr - opened

I set max_new_tokens=2048 and evaluated the model on MATH-500, but some responses get cut off before they finish. Compared to Qwen 2.5, the responses are now much longer. What can I do to shorten the generated answers?

I am facing a similar issue when evaluating this model. I observe that all of these reasoning models (o1, o3, DeepSeek, QwQ) take very long generations to reach EOS. I believe this is a consequence of how they were trained to reason, and there is not much we can do about it at the moment besides letting them generate many tokens...
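
In practice, the workaround is to raise the token budget and explicitly flag truncated responses so they can be scored (or retried) accordingly. Here is a minimal sketch using Hugging Face transformers; the model name is a placeholder for whatever model you are evaluating, and the budget of 8192 is just an illustrative value:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/QwQ-32B-Preview"  # placeholder; substitute the model under evaluation
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

messages = [{"role": "user", "content": "What is the sum of the first 100 positive integers?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models often need a much larger budget than 2048 new tokens.
outputs = model.generate(
    inputs,
    max_new_tokens=8192,
    eos_token_id=tokenizer.eos_token_id,
)

new_tokens = outputs[0][inputs.shape[-1]:]
# If the final token is not EOS, generation was cut off by the budget.
truncated = new_tokens[-1].item() != tokenizer.eos_token_id
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
print("truncated:", truncated)
```

This doesn't make the answers shorter, but at least it tells you which samples were truncated instead of silently marking them wrong.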
