Question about output length #59
opened by skylerr
I set `max_new_tokens=2048` and evaluated the model on MATH-500, but some responses are cut off before they finish. Compared to Qwen 2.5, the responses are now much longer. What can I do to shorten the generated answers?
I am facing a similar issue when evaluating that model. I observe that all of these reasoning models (o1, o3, DeepSeek, QwQ) produce very long generations before reaching EOS. I believe it is part of how they were trained to reason, and at the moment there is nothing we can do about it besides letting them generate many tokens...
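One practical workaround while keeping a token budget is to flag the answers that hit the cap without ever emitting EOS, so they can be re-run with a larger budget or excluded from scoring. A minimal framework-agnostic sketch (the EOS id and budget below are illustrative, not tied to any particular model):

```python
def find_truncated(sequences, eos_token_id, max_new_tokens):
    """Return indices of generations that used the full token budget
    without ever emitting EOS, i.e. answers that were cut off."""
    truncated = []
    for i, tokens in enumerate(sequences):
        if len(tokens) >= max_new_tokens and eos_token_id not in tokens:
            truncated.append(i)
    return truncated

# Toy generated-token sequences; 2 is the assumed EOS id, budget is 5.
outs = [
    [7, 8, 2],          # finished normally
    [7, 8, 9, 10, 11],  # hit the cap with no EOS -> truncated
    [7, 2],             # finished normally
]
print(find_truncated(outs, eos_token_id=2, max_new_tokens=5))  # -> [1]
```

With HF `generate`, the same check can be done on the returned sequences (minus the prompt length) before computing accuracy, so truncated samples don't silently count as wrong answers.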