LLaMA2-Accessory now supports inference and instruction finetuning of mixtral-8x7b-32kseqlen, both full-parameter and parameter-efficient (PEFT, e.g. LoRA). The load-balancing loss is supported, and more MoE features will be added soon. The documentation is available here.
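For context, the load-balancing loss mentioned above is the auxiliary loss commonly used when training MoE routers (as in the Switch Transformer and Mixtral papers): it penalizes routing distributions that concentrate tokens on a few experts. Below is a minimal PyTorch sketch of that idea; the function name and signature are illustrative, not LLaMA2-Accessory's actual API.

```python
import torch
import torch.nn.functional as F

def load_balancing_loss(router_logits: torch.Tensor,
                        num_experts: int,
                        top_k: int = 2) -> torch.Tensor:
    """Switch-Transformer-style auxiliary load-balancing loss (sketch).

    router_logits: (num_tokens, num_experts) raw gate scores.
    Returns a scalar that is minimized when tokens are spread
    evenly across experts.
    """
    # Per-token routing probabilities over experts.
    probs = torch.softmax(router_logits, dim=-1)        # (T, E)
    # Hard top-k expert assignment per token (Mixtral uses top_k=2).
    _, selected = torch.topk(probs, top_k, dim=-1)      # (T, k)
    mask = F.one_hot(selected, num_experts)             # (T, k, E)
    # f_i: fraction of routing slots dispatched to each expert.
    tokens_per_expert = mask.float().mean(dim=(0, 1))   # (E,)
    # P_i: mean routing probability assigned to each expert.
    prob_per_expert = probs.mean(dim=0)                 # (E,)
    # E * sum_i f_i * P_i, equal to 1 at perfect balance.
    return num_experts * torch.sum(tokens_per_expert * prob_per_expert)
```

In training, this term is typically added to the language-modeling loss with a small coefficient (e.g. 0.01) so that balancing pressure does not dominate the main objective.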