Getting different results using command line and code
When using the command line:

```shell
python -m mlx_lm.generate --model mlx-community/Meta-Llama-3-8B-Instruct-4bit --prompt "hello"
```
“Hello! It’s nice to meet you. Is there something I can help you with, or would you like to chat?”
But if I do this in Python:
"""
from mlx_lm import load, generate
model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")
response = generate(model, tokenizer, prompt="hello", verbose=True)
"""
“playing the role of the “bad guy” in the story. The other characters are all playing the role of the “good guy” in the story. The story is about a group of people who are trying to save the world from an alien invasion. The main character, who is playing the role of the “bad guy,” is a powerful and evil alien who is leading the invasion. The other characters are all working together to stop the alien and save the world. The story is full of action”
It’s an instruct model, so it expects its chat template. The command-line tool applies the tokenizer’s chat template automatically, but `generate` in Python passes the prompt through as-is, so the model sees the bare string "hello" and free-associates a completion (the "bad guy" story above). Apply the template yourself before generating:
```python
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "hello"}],
    tokenize=False,
    add_generation_prompt=True,
)
response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
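For reference, here is the whole thing as one script (a minimal sketch; the model name and message are taken from the question above, and the `print(prompt)` is just there so you can inspect what the template adds):

```python
from mlx_lm import load, generate

# Load the quantized instruct model and its tokenizer.
model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")

# Render the conversation through the tokenizer's chat template so the
# model sees the same special tokens the CLI would add.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "hello"}],
    tokenize=False,
    add_generation_prompt=True,
)

# Inspect the rendered prompt to see the Llama 3 header/EOT tokens.
print(prompt)

response = generate(model, tokenizer, prompt=prompt, verbose=True)
print(response)
```

Conversely, if you want the CLI to match the raw-string behavior of your original Python snippet, `mlx_lm.generate` has an `--ignore-chat-template` flag that skips the template.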