license: llama3.1

Llama 3.1 405B Quants

llama.cpp version b3459. Work is ongoing in llama.cpp to fully support this model. Some reports say the model works fine with a context size of 8192. Otherwise, you can also try changing the Frequency Base (the RoPE frequency base) as described in: https://www.reddit.com/r/LocalLLaMA/comments/1ectacp/until_the_rope_scaling_is_fixed_in_gguf_for/
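As a rough sketch of the two workarounds above, the Python helper below assembles a llama.cpp command line with the context size set to 8192 and, optionally, a custom RoPE frequency base. The binary path `./llama-cli` and the model filename are placeholders, not names taken from this repo, and the helper function itself is hypothetical. No frequency base value is given here, so take it from the linked thread and pass it in yourself.

```python
import shlex


def build_llama_cli_command(model_path, ctx_size=8192, rope_freq_base=None,
                            binary="./llama-cli"):
    """Build an argument list for llama.cpp's CLI (hypothetical helper).

    binary and model_path are placeholders; adjust them to your setup.
    """
    # -m selects the GGUF file, -c sets the context size.
    cmd = [binary, "-m", model_path, "-c", str(ctx_size)]
    # Only override the RoPE frequency base when a value is supplied;
    # the value itself is not specified in this README.
    if rope_freq_base is not None:
        cmd += ["--rope-freq-base", str(rope_freq_base)]
    return cmd


# Placeholder filename; use the quant file you actually downloaded.
command = build_llama_cli_command("model.gguf")
print(shlex.join(command))
```

To actually launch the model, the returned list can be handed to `subprocess.run(command)`. Building the list separately makes it easy to inspect the flags before starting a run on a model of this size.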

Let me know if you need bigger quants.