Model Card for jlzhou/Qwen2.5-3B-Infinity-Instruct-0625

Model Details

This model is Qwen/Qwen2.5-3B fine-tuned on the BAAI/Infinity-Instruct dataset (subset 0625). It was produced as part of the accompanying blog post, which has more details.

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "jlzhou/Qwen2.5-3B-Infinity-Instruct-0625"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Drop the prompt tokens so that only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
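
The slicing step near the end of the snippet is needed because, for decoder-only models, `generate` returns each prompt followed by the newly generated tokens. A minimal sketch of that trimming step on plain Python lists, with no model required (the token IDs are made up for illustration, and `strip_prompt` is a hypothetical helper name, not part of transformers):

```python
def strip_prompt(input_ids_batch, output_ids_batch):
    """Drop the leading prompt tokens from each generated sequence."""
    return [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(input_ids_batch, output_ids_batch)
    ]


# Made-up token IDs: each output repeats its prompt, then adds new tokens.
prompts = [[101, 7, 8], [101, 9]]
outputs = [[101, 7, 8, 42, 43], [101, 9, 55]]

new_tokens = strip_prompt(prompts, outputs)
print(new_tokens)  # [[42, 43], [55]]
```

Without this step, the decoded response would begin with the system and user messages rather than with the model's answer.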

Training Details

Training Data

This model was trained on the 0625 subset of BAAI/Infinity-Instruct: https://huggingface.co/datasets/BAAI/Infinity-Instruct
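
To inspect the training data, the subset can probably be loaded with the `datasets` library. The sketch below rests on several assumptions that this card does not confirm: that the subset is exposed as a config named `"0625"`, that access to the dataset has been granted on the Hub, and that each record stores a ShareGPT-style `conversations` list with `from` and `value` keys. Check the dataset page before relying on any of them. `to_messages` and `load_subset` are hypothetical helper names.

```python
# Assumed speaker labels in the dataset, mapped to chat-template roles.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}


def to_messages(conversations):
    """Convert an (assumed) ShareGPT-style turn list into chat messages."""
    return [
        {"role": ROLE_MAP[turn["from"]], "content": turn["value"]}
        for turn in conversations
    ]


def load_subset(config_name="0625"):
    """Download the subset; the config name is an assumption."""
    # Imported lazily so the helper above works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("BAAI/Infinity-Instruct", config_name, split="train")


# Hand-written record in the assumed format (not taken from the dataset).
example = [
    {"from": "human", "value": "Hi"},
    {"from": "gpt", "value": "Hello!"},
]
print(to_messages(example))
```

The converted list has the same shape as the `messages` variable in the getting-started snippet, so it can be passed to `tokenizer.apply_chat_template`.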

Training Hyperparameters

Training used the hyperparameters recommended in https://huggingface.co/BAAI/Infinity-Instruct-3M-0625-Qwen2-7B#training-details

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric                Value
Avg.                  16.61
IFEval (0-Shot)       35.58
BBH (3-Shot)          26.91
MATH Lvl 5 (4-Shot)    2.04
GPQA (0-shot)          2.57
MuSR (0-shot)          8.13
MMLU-PRO (5-shot)     24.43
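
The reported average appears to be the unweighted mean of the six benchmark scores. A quick check, assuming that is how the leaderboard computes it:

```python
# Benchmark scores copied from the table above.
scores = {
    "IFEval (0-Shot)": 35.58,
    "BBH (3-Shot)": 26.91,
    "MATH Lvl 5 (4-Shot)": 2.04,
    "GPQA (0-shot)": 2.57,
    "MuSR (0-shot)": 8.13,
    "MMLU-PRO (5-shot)": 24.43,
}

# Unweighted mean, rounded to two decimals like the reported value.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 16.61
```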