Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks
Open LLM Leaderboard Model Comparator: Compare Open LLM Leaderboard results