Llama-2-7b-vicuna-Chinese

Llama-2-7b-vicuna-Chinese is a chat model produced by full-parameter supervised fine-tuning on Vicuna-style ShareGPT data in both English and Chinese.

  • Foundation model: meta-llama/Llama-2-7b-hf, a language model whose license permits commercial use.
  • Finetuning data: ShareGPT, ShareGPT-ZH, Langchain-MRKL-finetune
  • Training code: based on FastChat

Main improvement: both English and Chinese ability improve over the original Llama 2 and Vicuna.

  • English eval results (MMLU): Llama-2-7b-vicuna-Chinese (48.8) > Llama-2-7b (45.3) > vicuna1.1 (44.8)
  • Chinese eval results (C-Eval): Llama-2-7b-vicuna-Chinese (34.7) > Llama-2-7b-chat (30.3) = vicuna1.1 (30.3)
  • Empirical results: in practice, the model does not show the excessive caution of Llama2-chat.
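
To make the size of the gains in the list above explicit, the point margins can be computed directly from the reported scores. This is a small arithmetic sketch only: the scores are copied from the list, while the `margins` helper and the dictionary names are hypothetical and not part of the model's release.

```python
# Scores copied from the evaluation list above.
mmlu = {"Llama-2-7b-vicuna-Chinese": 48.8, "Llama-2-7b": 45.3, "vicuna1.1": 44.8}
ceval = {"Llama-2-7b-vicuna-Chinese": 34.7, "Llama-2-7b-chat": 30.3, "vicuna1.1": 30.3}


def margins(scores, model="Llama-2-7b-vicuna-Chinese"):
    """Point difference between `model` and every other entry, rounded to one decimal."""
    return {
        name: round(scores[model] - score, 1)
        for name, score in scores.items()
        if name != model
    }


mmlu_margins = margins(mmlu)
ceval_margins = margins(ceval)
print("MMLU margins:", mmlu_margins)
print("C-Eval margins:", ceval_margins)
```

On these numbers the model leads by 3.5 and 4.0 points on MMLU and by 4.4 points over both baselines on C-Eval. Note that the MMLU baseline is Llama-2-7b while the C-Eval baseline is Llama-2-7b-chat, so the two lists are not directly comparable.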

Loading the Weights with Hugging Face Transformers

See the following example for usage.

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the slow (non-fast) tokenizer and the model weights, then move the model to the GPU.
tokenizer = AutoTokenizer.from_pretrained("PengQu/Llama-2-7b-vicuna-Chinese", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("PengQu/Llama-2-7b-vicuna-Chinese").to("cuda")

# Single-turn chat template; {} is replaced by the user message.
instruction = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {} ASSISTANT:"
# The prompt is in Chinese: "Write a simple HTTP server with Flask."
prompt = instruction.format('用flask写一个简单的http服务器。')
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")

# Generate up to 512 new tokens, then decode the full sequence without special tokens.
generation_output = model.generate(input_ids=input_ids, max_new_tokens=512)
print(tokenizer.decode(generation_output[0], skip_special_tokens=True))

Output (the model answers in Chinese, the language of the prompt):

要使用Flask创建一个简单的HTTP服务器,您需要在终端中运行以下命令:

pip install Flask

这将安装Flask包并准备您使用它。
以下是一个简单的Flask应用程序,它侦听端口5000并响应“Hello, World!”消息:

from flask import Flask
app = Flask(__name__)
@app.route('/')
def hello():
    return 'Hello, World!'
if __name__ == '__main__':
    app.run(debug=True)

要运行此应用程序,您可以将代码保存到名为`app.py`的文件中,然后在终端中运行以下命令:

python app.py

这将启动Flask服务器,并且您应该能够通过在Web浏览器中访问`http://localhost:5000`来访问“Hello, World!”消息。
您还可以使用Flask的路由功能为不同的URL定义不同的端点。例如,您可以定义一个路由,在端口8080上侦听,并响应“Hello, Flask!”消息:

from flask import Flask
app = Flask(__name__)
@app.route('/')
def hello():
    return 'Hello, Flask!'
if __name__ == '__main__':
    app.run(debug=True, host='localhost', port=8080)

要运行此应用程序,您可以将代码保存到名为`app.py`的文件中,然后在终端中运行以下命令:

python app.py
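
The loading example above hard-codes the prompt template and prints the whole decoded sequence. A minimal sketch of two pure-Python helpers that wrap a user message in the same single-turn template and pull the reply out of a decoded string is shown below. The names `SYSTEM`, `build_prompt`, and `extract_reply` are hypothetical and not part of transformers or FastChat. The sketch assumes that the decoded text contains the prompt and that the user message itself does not contain the `ASSISTANT:` marker.

```python
# System text copied verbatim from the loading example above.
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)
MARKER = "ASSISTANT:"


def build_prompt(user_message: str) -> str:
    """Wrap one user message in the single-turn template from the example."""
    return f"{SYSTEM} USER: {user_message} {MARKER}"


def extract_reply(decoded: str) -> str:
    """Return the text after the first 'ASSISTANT:' marker.

    Falls back to the whole string when the marker is missing.
    """
    _, found, reply = decoded.partition(MARKER)
    return reply.strip() if found else decoded.strip()


if __name__ == "__main__":
    prompt = build_prompt("Write a simple HTTP server with Flask.")
    # Stand-in for tokenizer.decode(...); a real run would use the model output.
    decoded = prompt + " Install Flask with pip, then define a route."
    print(extract_reply(decoded))
```

With the model loaded, `build_prompt(...)` would replace `instruction.format(...)`, and `extract_reply(...)` would be applied to the string returned by `tokenizer.decode(...)`.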