72b Eval model failed

#685

by paloalma - opened Apr 17, 2024

paloalma

Apr 17, 2024

Hello again,

It seems that even after repushing the models for a third time, the benchmarks failed.

https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/d353428028d6a8a2a3484b09ac3675a72ac6fc9e
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/dfe493a00eacdd0d7a10c7312e9e9afee01e578d
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/b76ba68e5c5167a07193dc2dbd278f9af0751051
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/736318a7bf0d5d28e1042181d53141b7456e0555
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/839301d9daf30d03a49555c96f0f16b411a30f2a
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/a690668f374d540d5584bb8fd711544fdfa98905

Could we please get the logs, or see if there is another way to evaluate them? You mentioned several times that it was a network error, but this is the third time we have resubmitted them and they always seem to hit this error.

Thank you for your help,
André

clefourrier

Open LLM Leaderboard org Apr 17, 2024

Hi @paloalma !
As @alozowski mentioned, we encountered network problems this weekend/beginning of week, which prevented your models from being downloaded.
Thanks to your other issues, we were able to give logs to our infra team, who are now working on making our network even more stable to prevent timeouts.
Please note that the bigger your models, the more likely they are to (naturally) encounter network issues.

The current failure, however, is entirely human error on our side: we managed to break production this morning while adding a small fix, so all models that ran in the last 4 hours probably failed and will be rescheduled ASAP (we're doing some last checks to make sure we reverted all accidental changes).

paloalma

Apr 17, 2024

Awesome, I understand much better now. Thank you, and feel free to keep me updated. We are writing a research paper for an engineering school, hence the need to benchmark these models, which were built with different, new methods. We will make sure to cite the Hugging Face Open LLM Leaderboard results in the paper once we get them.

clefourrier

Open LLM Leaderboard org Apr 18, 2024

Hi! @alozowski relaunched all models yesterday so I'm closing this issue. Feel free to reopen it if needed!

clefourrier changed discussion status to closed Apr 18, 2024
