How long does it take to run these tests?
Some of the models under "currently running" have been there for over a week. I suspect there is an issue or delay of some kind. Is it supposed to take 1-2+ weeks to run a model through this benchmark? I did not personally submit any tests, but the leaderboard seems to be making no progress. lol Anyone know of any alternative sites or links?
https://github.com/aigoopy/llm-jeopardy
This leaderboard is far superior. The open_llm_leaderboard hasn't been up-to-date in ages, if ever.
Thank you a lot TNTOutburst! I will check it out.
Some models have been in running status for literally 50 days. Also, several days ago they said they would be adding a human/GPT-4 eval tab, but nothing has been released yet.
That is great, but it seems to cover llama-based models only.
Hi @Goldenblood56 !
The leaderboard has been hanging because 1) we have been changing the backend to make it faster, 2) we spent a week investigating the MMLU score discrepancies (btw, did you see our blog post?), and 3) we had to re-run all the models already on the leaderboard because of these discrepancies - we are doing our best to do this as fast as we can! 🤗
Thanks for the update and hard work.