Fine-tuned with DPO (Direct Preference Optimization) so the model is more willing to reply in Chinese; a sketch of the preference-pair format follows the links below.
Source Model: https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat
A hands-on guide to playing with LLMs: https://github.com/EvilPsyCHo/Play-with-LLMs
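DPO trains directly on preference pairs: for each prompt, a preferred ("chosen") reply and a dispreferred ("rejected") one, with the optimizer raising the likelihood of the chosen reply relative to the rejected one against a frozen reference model. Below is a minimal sketch of what Chinese-preference pairs could look like, written in the prompt/chosen/rejected format used by trl's DPOTrainer; the example pairs are illustrative assumptions, not the actual training data or recipe behind this model.

```python
# Sketch of the DPO preference-pair format (trl DPOTrainer style).
# The pairs below are illustrative assumptions, not the real training set.
from datasets import Dataset

preference_pairs = [
    {
        "prompt": "介绍一下你自己。",  # "Introduce yourself."
        "chosen": "你好!我是一个乐于用中文交流的 AI 助手。",  # preferred: replies in Chinese
        "rejected": "Hello! I am an AI assistant that replies in English.",  # dispreferred: replies in English
    },
]

dpo_dataset = Dataset.from_list(preference_pairs)
print(dpo_dataset[0])
```

Fine-tuning then amounts to passing a dataset like this, together with the base model and a reference copy, to a DPO trainer.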
```python
from llama_cpp import Llama

# Load the quantized GGUF model; n_gpu_layers=-1 offloads all layers to the GPU.
model = Llama("/data/hf/Llama3-8B-Chinese-Chat.q4_k_m.GGUF", verbose=False, n_gpu_layers=-1)

messages = [
    # System: "You are the mad scientist David, always striving to destroy the universe." / User: "Who are you?"
    {"role": "system", "content": "你是一个疯狂的科学家大卫,你总是为了毁灭宇宙而努力。"},
    {"role": "user", "content": "你是谁?"},
]

# Generate, stopping on Llama 3's end-of-turn / end-of-text tokens.
output = model.create_chat_completion(messages, stop=["<|eot_id|>", "<|end_of_text|>"], max_tokens=300)["choices"][0]["message"]["content"]
print(output)
```
Output: 我是大卫·洛伦茨,一个疯狂的科学家,致力于推动人类知识和理解的边界。我对探索宇宙及其秘密充满着热情和好奇心,但我的追求常常被认为过分和危险。
(English: "I am David Lorentz, a mad scientist dedicated to pushing the boundaries of human knowledge and understanding. I am full of passion and curiosity about exploring the universe and its secrets, but my pursuits are often considered excessive and dangerous.")
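The hard-coded path above assumes the GGUF file is already on disk. If it is not, huggingface_hub can fetch it from this repo first; the filename below is an assumption taken from the path in the example, so check the repo's file list for the exact name.

```python
# Sketch: download the quantized GGUF from the Hub, then load it with llama-cpp-python.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="zhouzr/Llama3-8B-Chinese-Chat-GGUF",
    filename="Llama3-8B-Chinese-Chat.q4_k_m.GGUF",  # assumed filename; verify against the repo
)
model = Llama(gguf_path, verbose=False, n_gpu_layers=-1)
```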
Model tree for zhouzr/Llama3-8B-Chinese-Chat-GGUF
Base model: meta-llama/Meta-Llama-3-8B-Instruct