Open LLM Leaderboard 2
Track, rank and evaluate open LLMs and chatbots
VLMEvalKit Evaluation Results Collection
More advanced and challenging multi-task evaluation