---
license: llama2
tags:
- text2text-generation
pipeline_tag: text2text-generation
language:
- zh
- en
---

# BELLE-Llama2-13B-chat-0.4M

## Welcome

If you find this model helpful, please *like* it and star us at https://github.com/LianjiaTech/BELLE!

## Model description

This model was obtained by full-parameter fine-tuning of the original Llama2-13B-chat on 0.4M Chinese instruction examples.

We firmly believe that the original Llama2-chat performs admirably after Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). Our goal remains to further enhance the model by fine-tuning it on Chinese instruction data, so that it produces stable, high-quality Chinese output.

## Use the model

Please note that the input should be formatted as follows in both **training** and **inference**:

```
Human: \n{input}\n\nAssistant:\n
```
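
For clarity, the sketch below shows one way to fill in this template from a user query. The `PROMPT_TEMPLATE` constant and `build_prompt` helper are illustrative names of ours, not part of the released code.

```python
# Hypothetical helper, for illustration only.
PROMPT_TEMPLATE = "Human: \n{input}\n\nAssistant:\n"

def build_prompt(user_input: str) -> str:
    # Wrap a single-turn user query in the training/inference template.
    return PROMPT_TEMPLATE.format(input=user_input)

# The example query means "Write a Chinese song praising nature".
print(build_prompt("写一首中文歌曲,赞美大自然"))
```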

After you decrypt the files, BELLE-Llama2-13B-chat-0.4M can be loaded easily with `AutoModelForCausalLM`:

```python
from transformers import AutoModelForCausalLM, LlamaTokenizer
import torch

ckpt = '/path/to_finetuned_model/'
device = torch.device('cuda')

# Load the fine-tuned weights in fp16 and the matching tokenizer.
model = AutoModelForCausalLM.from_pretrained(ckpt).half().to(device)
tokenizer = LlamaTokenizer.from_pretrained(ckpt)

# The query means "Write a Chinese song praising nature"; it must follow
# the "Human: \n{input}\n\nAssistant:\n" template exactly.
prompt = "Human: \n写一首中文歌曲,赞美大自然\n\nAssistant:\n"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

generate_ids = model.generate(
    input_ids, max_new_tokens=1024, do_sample=True, top_k=30, top_p=0.85,
    temperature=0.5, repetition_penalty=1.2,
    eos_token_id=2, bos_token_id=1, pad_token_id=0,
)

# Decode the full sequence, then strip the prompt to keep only the reply.
output = tokenizer.batch_decode(generate_ids, skip_special_tokens=True,
                                clean_up_tokenization_spaces=False)[0]
response = output[len(prompt):]
print(response)
```
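
If you would rather see tokens as they are produced instead of waiting for the full completion, the `TextStreamer` class from transformers can be passed to `generate`. This is an optional convenience we are adding, not part of the original instructions; it reuses the `model`, `tokenizer`, and `input_ids` defined above.

```python
# Optional: stream tokens to stdout as they are generated.
from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(input_ids, max_new_tokens=1024, do_sample=True, top_k=30,
                   top_p=0.85, temperature=0.5, repetition_penalty=1.2,
                   eos_token_id=2, bos_token_id=1, pad_token_id=0,
                   streamer=streamer)
```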

## Limitations

A few issues remain in the model trained on the current base model and data:

1. The model may produce factually incorrect responses when asked to follow instructions involving factual knowledge.

2. It occasionally generates harmful responses, as it still struggles to identify potentially harmful instructions.

3. Its reasoning and coding abilities still need improvement.

Because the model still has these limitations, we require that developers use the open-sourced code, data, models, and any other artifacts generated by this project for research purposes only. Commercial use and other potentially harmful use cases are not allowed.