Mistral-Ita for Generative Q&A
Overview
This model is a fine-tuned version of Mistral_ita, specialized for question answering (Q&A). It takes a question and its context as input and returns a relevant answer, or indicates that an answer cannot be derived from the given context.
Model Capabilities
- Contextual Understanding: Can process both questions and their contextual information to generate relevant answers.
- Indicative Responses: Capable of signaling when the information provided is insufficient to derive an answer.
How to Use
To use this model for Q&A, provide a question together with the related context. The model either generates an answer or indicates that the context does not contain the information needed to answer.
The example below uses the model to generate Italian text with the Transformers library:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

MODEL_NAME = "Moxoff/Mistral_InfoSynth"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, torch_dtype=torch.bfloat16, device_map="auto"
).eval()

def stream(user_prompt):
    # Wrap the prompt in the instruction markers and stream the answer to stdout.
    device = "cuda:0"  # assumes a CUDA GPU is available
    system_prompt = ''
    B_INST, E_INST = "<s> [INST]", "[/INST]"
    prompt = f"{system_prompt}{B_INST}{user_prompt.strip()}\n{E_INST}"
    inputs = tokenizer([prompt], return_tensors="pt").to(device)
    streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    # Sampling with a near-zero temperature behaves almost like greedy decoding.
    _ = model.generate(**inputs, streamer=streamer, max_new_tokens=300, temperature=0.0001,
                       repetition_penalty=1.2, eos_token_id=2, do_sample=True, num_return_sequences=1)

# Question: "How tall is the Tower of Pisa?"
domanda = """Quanto è alta la torre di Pisa?"""
# Context: "The Tower of Pisa is a 12th-century bell tower, famous for its lean. About 56 metres tall."
contesto = """
La Torre di Pisa è un campanile del XII secolo, famoso per la sua inclinazione. Alta circa 56 metri.
"""
# Instruction: "Answer the following question based on the provided context."
prompt = f"Rispondi alla seguente domanda basandoti sul contesto fornito. Domanda: {domanda}, contesto: {contesto}"
stream(prompt)
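The prompt construction in the example above can be factored into small helpers so that different question and context pairs reuse the same template. The sketch below is illustrative only: `build_qa_prompt` and `wrap_instruct` are hypothetical names, not part of the model or of Transformers, and unlike the example above the helper strips surrounding whitespace from the context. It builds a prompt whose context does not contain the answer, which is the case where the model is described as signaling insufficient information. No model output is shown or assumed.

```python
def build_qa_prompt(domanda: str, contesto: str) -> str:
    """Fill the Italian instruction template with a question and its context.

    Hypothetical helper: it reuses the template string from the example above,
    but strips leading/trailing whitespace from both inputs (an assumption).
    """
    return (
        "Rispondi alla seguente domanda basandoti sul contesto fornito. "
        f"Domanda: {domanda.strip()}, contesto: {contesto.strip()}"
    )


def wrap_instruct(user_prompt: str, system_prompt: str = "") -> str:
    """Wrap a prompt in the same [INST] ... [/INST] markers that stream() uses."""
    B_INST, E_INST = "<s> [INST]", "[/INST]"
    return f"{system_prompt}{B_INST}{user_prompt.strip()}\n{E_INST}"


# A question the context cannot answer ("Who designed the Tower of Pisa?").
prompt = build_qa_prompt(
    "Chi ha progettato la torre di Pisa?",
    "La Torre di Pisa è un campanile del XII secolo, famoso per la sua inclinazione. Alta circa 56 metri.",
)
full_prompt = wrap_instruct(prompt)
print(full_prompt)
```

Passing `prompt` to `stream()` reproduces the flow of the example above, since `stream()` applies the instruction markers itself; `wrap_instruct` is only useful for inspecting the final string sent to the tokenizer.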
GGUF VERSION