Model Introduction

This release is a Chinese SFT (supervised fine-tuning) version built on the mistral-large-instruct-2407 model. Relative to the original instruct version, it handles Chinese content and emoji more naturally, improving both question-answering performance and the overall user experience.

Highlights: improved handling of Chinese text and emoji, without degrading the capabilities of the original instruct model. In our tests, this Chinese SFT version outperforms the llama3_1-405B Chinese model on question answering (demo, demo1).

Training Details

  • LoRA rank 128, alpha 256 (a configuration sketch follows below)
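
For reference, here is a minimal sketch of what such an adapter configuration might look like with the peft library. Only the rank and alpha are stated on this card; the target modules, dropout, and task type are illustrative assumptions.

from peft import LoraConfig

# Hypothetical reconstruction of the adapter config: only r and lora_alpha
# are stated on this card; the remaining fields are illustrative assumptions.
lora_config = LoraConfig(
    r=128,            # LoRA rank, as stated above
    lora_alpha=256,   # LoRA scaling alpha, as stated above
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    lora_dropout=0.05,  # assumed
    task_type="CAUSAL_LM",
)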

Model Download

Clone the model via Git LFS:

git lfs install
git clone https://huggingface.co/opencsg/CSG-Wukong-Chinese-Mistral-Large2-123B
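
Alternatively, the repository can be fetched with the huggingface_hub Python client. This is a minimal sketch using the same repo id; the local_dir path is a placeholder chosen for illustration.

from huggingface_hub import snapshot_download

# Download the full repository (the weights are large at 123B parameters).
snapshot_download(
    repo_id="opencsg/CSG-Wukong-Chinese-Mistral-Large2-123B",
    local_dir="CSG-Wukong-Chinese-Mistral-Large2-123B",  # illustrative path
)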

LoRA Weight Merging Guide

To merge the LoRA weights into the base model, use the following Python code:

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then apply the LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-Large-Instruct-2407")
peft_model_id = "opencsg/CSG-Wukong-Chinese-Mistral-Large2-123B"
model = PeftModel.from_pretrained(base_model, peft_model_id)

# merge_and_unload() returns the merged model; capture the return value.
merged_model = model.merge_and_unload()
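
After merging, the combined weights can be saved and used like any standalone model. A minimal sketch follows; the output directory and the prompt are placeholders, and loading the tokenizer from the base model repo is an assumption.

from transformers import AutoTokenizer

# Assumed: the model uses the base model's tokenizer.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Large-Instruct-2407")

# Persist the merged weights so the adapter is no longer needed at inference time.
merged_model.save_pretrained("./csg-wukong-mistral-large2-merged")  # illustrative path
tokenizer.save_pretrained("./csg-wukong-mistral-large2-merged")

# Quick smoke test (note: generating with a 123B model requires substantial hardware).
inputs = tokenizer("你好，请介绍一下你自己。", return_tensors="pt")
outputs = merged_model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))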