Add common benchmark like MMLU/HumanEval
#12
opened by Amadeusystem
Could you report MMLU/HumanEval/TriviaQA scores compared to the original model, so we can evaluate the loss/gain on a common baseline?
Hi,
I'm not sure what problem you are referring to.
Models can be freely downloaded and tested with any tool.
Best regards.
What I meant is that, yes, we can run these benchmarks ourselves, but it is common practice in the community to publish scores on these standard baselines, so we can evaluate the model faster and decide whether it is worth trying. It is also in your own interest: it makes the model clearer and easier to compare.
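For reference, a minimal sketch of how such scores could be reproduced with EleutherAI's lm-evaluation-harness (the `lm_eval` package); the model ID, few-shot setting, and batch size below are placeholders, not values from this thread:

```python
# Sketch: evaluating a model on MMLU and TriviaQA with lm-evaluation-harness.
# Assumes `pip install lm-eval` and a local GPU; the model ID is a placeholder.
# (HumanEval is omitted here because it requires executing generated code.)
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                   # Hugging Face transformers backend
    model_args="pretrained=your-org/your-model",  # placeholder model ID
    tasks=["mmlu", "triviaqa"],                   # common baselines discussed above
    num_fewshot=5,
    batch_size=8,
)

# Print the aggregated metrics for each task.
for task, metrics in results["results"].items():
    print(task, metrics)
```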
Hi,
Thank you very much for your valuable suggestions.
We have noted them and forwarded them to the manager.
If there is anything else you would like to add, please let us know.
Best regards,