Why is the model returning "biasedbiasedbiasedbiasedbiasedbiasedbiasedbiased"?
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Define quantization configuration
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
)

# Model path
mistral_models_path = "mistralai/Mistral-Nemo-Instruct-2407"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    mistral_models_path,
    cache_dir=r'D:\ai\mistral_models\bosta'
)

# Load model with updated configurations
model = AutoModelForCausalLM.from_pretrained(
    mistral_models_path,
    torch_dtype=torch.bfloat16,
    quantization_config=quantization_config,
    cache_dir=r'D:\ai\mistral_models\bosta'
)
conversation = [{"role": "user", "content": "tell me a history of a bread"}]
tools = [get_current_weather]  # note: not passed to apply_chat_template below

# Format and tokenize the tool use prompt
inputs = tokenizer.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)

inputs.to(model.device)
a = model.generate(**inputs, max_new_tokens=50, temperature=0.7,
                   top_k=50,
                   top_p=0.9)
print(tokenizer.decode(a[0], skip_special_tokens=True))
# Output:
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
C:\Users\joaom\AppData\Roaming\Python\Python311\site-packages\transformers\generation\utils.py:1259: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
tell me a history of a breadSurebiasedbiasedbiasedbiasedbiasedbiasedbiasedbiased
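
For anyone trying to reproduce this, here is a minimal sketch of the same generate call wrapped in a helper. It assumes `model` and `tokenizer` are loaded as in the snippet above; `generate_reply` is a hypothetical name, not part of transformers. It makes two changes: it sets `do_sample=True`, since `temperature`, `top_k` and `top_p` only take effect when sampling is enabled, and it passes `pad_token_id` explicitly and decodes only the newly generated tokens so the prompt is not echoed back. This is only a cleaner way to call `generate`; it is not claimed to fix the repeated "biased" output.

```python
def generate_reply(model, tokenizer, conversation, max_new_tokens=50):
    """Generate one assistant reply and return only the newly generated text.

    Hypothetical helper: `model` and `tokenizer` are assumed to be loaded
    as in the snippet above.
    """
    inputs = tokenizer.apply_chat_template(
        conversation,
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    )
    inputs = inputs.to(model.device)

    # Number of prompt tokens, used below to strip the prompt from the output.
    prompt_len = len(inputs["input_ids"][0])

    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,  # sampling parameters below are ignored without this
        temperature=0.7,
        top_k=50,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,  # set explicitly instead of relying on the default
    )

    # Keep only the tokens generated after the prompt.
    new_tokens = output_ids[0][prompt_len:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Usage would be `print(generate_reply(model, tokenizer, conversation))`. If the reply is still a run of the same token, the sampling settings are probably not the cause, and the loading step (4-bit quantization combined with `torch_dtype=torch.bfloat16`) would be the next thing to compare against an unquantized run.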