metadata

license: apache-2.0
datasets:
  - shareAI/DPO-zh-en-emoji
language:
  - zh
  - en
pipeline_tag: question-answering
tags:
  - dpo
  - llama3.1
  - llama3
  - chat

llama3-instruct Chinese DPO Edition

Model Introduction

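Like the original instruct model, this version likes to answer questions in lively Chinese with emoji.
GitHub: https://github.com/CrazyBoyM/llama3-Chinese-chat
Training recipe details, shared for reference: DPO (beta 0.5) + LoRA rank 128, alpha 256 + the "lm_head", "input_layernorm", "post_attention_layernorm", and "norm" layers opened for training.
Features: prefers Chinese and emoji without damaging the capabilities of the original instruct model. In the author's hands-on testing, the question-answering experience of this Chinese DPO version surpasses that of any llama3 Chinese fine-tuned version currently available (fine-tuning damages the original llama3 capabilities and causes forgetting).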
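
To make the recipe's numbers concrete, here is a minimal sketch of the standard DPO loss for a single preference pair, evaluated at beta 0.5, together with the usual LoRA scaling factor alpha / rank. It is not the author's training code: the function names and the example log-probabilities are hypothetical, and it assumes the plain sigmoid form of the DPO objective and the standard (non rank-stabilized) LoRA scaling.

```python
import math

# Values taken from the recipe above.
BETA = 0.5        # DPO beta
LORA_RANK = 128   # LoRA rank
LORA_ALPHA = 256  # LoRA alpha


def lora_scale(alpha: float, rank: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / rank."""
    return alpha / rank


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = BETA) -> float:
    """DPO loss for one (chosen, rejected) pair of summed sequence log-probs."""
    # Implicit rewards: how far the policy has moved from the frozen reference.
    chosen_shift = policy_chosen_logp - ref_chosen_logp
    rejected_shift = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_shift - rejected_shift)
    # -log(sigmoid(margin)) == softplus(-margin), written in a stable form.
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    print("LoRA scale:", lora_scale(LORA_ALPHA, LORA_RANK))
    # Hypothetical log-probs: the policy favors the chosen answer more than
    # the reference does, so the loss falls below log(2).
    print("DPO loss:", dpo_loss(-10.0, -12.0, -11.0, -11.0))
```

When the policy and the reference agree exactly, the margin is zero and the loss equals log 2. A larger beta makes the same log-probability gap count for more, which penalizes drifting from the reference more sharply.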

Model Deployment

Web demo script: https://github.com/CrazyBoyM/llama3-Chinese-chat/blob/main/deploy/web_streamlit_for_instruct_v2.py

pip install streamlit
streamlit run web_streamlit_for_instruct_v2.py ./Llama3-Chinese-instruct-DPO-beta0.5
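
The launch command above takes the local model directory as its last argument. If the directory is produced by a script (for example, the path returned by a download step), a small helper can build the same command. This is only a sketch: `build_launch_command` is a hypothetical name, and it assumes the web script has been saved to the current working directory.

```python
import shlex


def build_launch_command(model_dir: str,
                         script: str = "web_streamlit_for_instruct_v2.py") -> list:
    """Return the argv list for launching the Streamlit demo on model_dir."""
    # Mirrors: streamlit run <script> <model_dir>
    return ["streamlit", "run", script, model_dir]


if __name__ == "__main__":
    cmd = build_launch_command("./Llama3-Chinese-instruct-DPO-beta0.5")
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
    # To actually start the demo, pass the list to subprocess.run(cmd, check=True).
```

Building the command as a list rather than a single string avoids quoting problems if the model path contains spaces.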

Model Download

SDK Download

# Install ModelScope
pip install modelscope

# Download the model with the SDK
from modelscope import snapshot_download
model_dir = snapshot_download('baicai003/Llama3-Chinese-instruct-DPO-beta0.5')

Git Download

# Download the model with Git
git clone https://www.modelscope.cn/baicai003/Llama3-Chinese-instruct-DPO-beta0.5.git