Converted LLaMA from Qwen2-7B-Instruct

Description

This is Qwen2-7B-Instruct converted to the LLaMA format. The conversion lets you use Qwen2-7B-Instruct as if it were a LLaMA model, which is convenient for some inference use cases. The precision is exactly the same as the original model.

Usage

You can load the model using the LlamaForCausalLM class as shown below:

from transformers import AutoTokenizer, LlamaForCausalLM

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# we still use the original tokenizer from Qwen2-7B-Instruct
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to("cuda")

# Converted LLaMA model
llama_model = LlamaForCausalLM.from_pretrained(
    "silence09/Qwen2-7B-Instruct-Converted-Llama",
    torch_dtype='auto').cuda()
llama_generated_ids = llama_model.generate(model_inputs.input_ids, max_new_tokens=32, do_sample=False)
# Strip the prompt tokens so only the newly generated text remains
llama_generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, llama_generated_ids)
]
llama_response = tokenizer.batch_decode(llama_generated_ids, skip_special_tokens=True)[0]
print(llama_response)

Precision Guarantee

To compare results with the original model, you can use this code.
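A minimal sketch of such a check (the direct logit comparison is illustrative and assumes enough GPU memory to hold both 7B models at once; otherwise load and score them one at a time):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, LlamaForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")
inputs = tokenizer("Give me a short introduction to large language model.",
                   return_tensors="pt").to("cuda")

# Original Qwen2 model
qwen_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct", torch_dtype='auto').cuda()
# Converted LLaMA-format model
llama_model = LlamaForCausalLM.from_pretrained(
    "silence09/Qwen2-7B-Instruct-Converted-Llama", torch_dtype='auto').cuda()

with torch.no_grad():
    qwen_logits = qwen_model(**inputs).logits
    llama_logits = llama_model(**inputs).logits

# If the conversion is exact, the logits match bit for bit
print(torch.equal(qwen_logits, llama_logits))
print((qwen_logits - llama_logits).abs().max().item())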

More Info

The model was converted using the Python script available at this repository.
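The conversion is possible because Qwen2's architecture is essentially LLaMA with biases on the Q/K/V attention projections. As a rough illustration of the idea (an assumption-laden sketch, not the repository's actual script), the weights can be copied into a LlamaConfig that enables attention_bias:

import torch
from transformers import AutoModelForCausalLM, LlamaConfig, LlamaForCausalLM

qwen = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-7B-Instruct", torch_dtype='auto')
cfg = qwen.config

# Qwen2 is LLaMA-shaped apart from the Q/K/V projection biases,
# which LlamaConfig supports via attention_bias=True.
llama_cfg = LlamaConfig(
    vocab_size=cfg.vocab_size,
    hidden_size=cfg.hidden_size,
    intermediate_size=cfg.intermediate_size,
    num_hidden_layers=cfg.num_hidden_layers,
    num_attention_heads=cfg.num_attention_heads,
    num_key_value_heads=cfg.num_key_value_heads,
    max_position_embeddings=cfg.max_position_embeddings,
    rms_norm_eps=cfg.rms_norm_eps,
    rope_theta=cfg.rope_theta,
    attention_bias=True,            # keep Qwen2's Q/K/V biases
    tie_word_embeddings=cfg.tie_word_embeddings,
)
llama = LlamaForCausalLM(llama_cfg)

# Parameter names line up one-to-one; strict=False because attention_bias=True
# also gives o_proj a bias, which Qwen2 does not have.
llama.load_state_dict(qwen.state_dict(), strict=False)
for layer in llama.model.layers:
    layer.self_attn.o_proj.bias.data.zero_()  # neutralize the extra bias

llama.to(torch.bfloat16).save_pretrained("Qwen2-7B-Instruct-Converted-Llama")

The tokenizer is unchanged, which is why the usage example above keeps loading it from Qwen/Qwen2-7B-Instruct.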
