Qwen/Qwen2-57B-A14B-Instruct
#115
by yttria - opened
ggerganov said intermediate_size should not be set to 20480; instead, the convert-hf-to-gguf.py script has now been fixed to take the moe_intermediate_size and shared_expert_intermediate_size fields from config.json into account. Would using the new version, without changing intermediate_size, produce a better quant?
https://github.com/ggerganov/llama.cpp/issues/7816#issuecomment-2155898007
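For context, here is a minimal sketch of what picking those fields up from config.json could look like. This is illustrative only, not the actual convert-hf-to-gguf.py logic; the file path, variable names, and print are assumptions, while the two field names come from the Qwen2-MoE config mentioned above:

```python
import json

# Illustrative sketch only, not the real convert-hf-to-gguf.py code:
# read the MoE-specific FFN sizes directly from config.json instead of
# relying on a hand-edited intermediate_size value.
with open("config.json") as f:
    cfg = json.load(f)

moe_ffn = cfg["moe_intermediate_size"]                # per-expert FFN width
shared_ffn = cfg["shared_expert_intermediate_size"]   # shared-expert FFN width

print(f"expert FFN size: {moe_ffn}, shared-expert FFN size: {shared_ffn}")
```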
Sure, thanks for bringing this to my attention (especially for the link)!
It should be in the queue, and done within the next few days at most.
mradermacher changed discussion status to closed
The quant failed again. I now also think that the 57B model is simply broken (see the bug report you linked).