Hi, I tried to reproduce the metrics you reported by running transformers/examples/research_projects/codeparrot/scripts/human_eval.py, but my results deviate significantly from the reported ones. Can you reproduce the reported results on your end?