Model Architecture

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

MODEL_NAME = "DeepMount00/Llama-3.1-8b-Ita"

model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16).eval()
model.to(device)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

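def generate_answer(prompt):
    messages = [
        {"role": "user", "content": prompt},
    ]
    # Append the assistant header so the model answers rather than continuing the user turn
    model_inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(device)
    # A near-zero temperature makes sampling effectively greedy
    generated_ids = model.generate(model_inputs, max_new_tokens=200, do_sample=True,
                                   temperature=0.001)
    # Decode only the newly generated tokens, skipping the echoed prompt
    new_tokens = generated_ids[:, model_inputs.shape[1]:]
    decoded = tokenizer.batch_decode(new_tokens, skip_special_tokens=True)
    return decoded[0]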

prompt = "Come si apre un file json in python?"
answer = generate_answer(prompt)
print(answer)
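
The example above sends a single user turn. When a system instruction or earlier turns are needed, the message list could be assembled along the following lines. This is a minimal sketch: `build_messages` is a hypothetical helper (it is not part of this model card or of transformers), and it assumes the model's chat template accepts the usual "system", "user", and "assistant" roles.

```python
from typing import List, Optional, Tuple

def build_messages(
    prompt: str,
    system: Optional[str] = None,
    history: Optional[List[Tuple[str, str]]] = None,
) -> List[dict]:
    """Hypothetical helper: assemble a chat message list for apply_chat_template."""
    messages = []
    # Optional system instruction goes first
    if system:
        messages.append({"role": "system", "content": system})
    # Earlier (user, assistant) exchanges, oldest first
    for user_turn, assistant_turn in (history or []):
        messages.append({"role": "user", "content": user_turn})
        messages.append({"role": "assistant", "content": assistant_turn})
    # The new user prompt always comes last
    messages.append({"role": "user", "content": prompt})
    return messages

messages = build_messages(
    "Come si apre un file json in python?",
    system="Rispondi in modo conciso.",
)
print(messages)
```

The resulting list can be passed to `tokenizer.apply_chat_template` in place of the one built inside `generate_answer`.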

Developer

Michele Montebovi

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 28.23 |
| IFEval (0-shot)     | 79.17 |
| BBH (3-shot)        | 30.93 |
| MATH Lvl 5 (4-shot) | 10.88 |
| GPQA (0-shot)       | 5.03  |
| MuSR (0-shot)       | 11.40 |
| MMLU-PRO (5-shot)   | 31.96 |
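
The reported average appears to be the unweighted mean of the six benchmark scores. That is an inference from the numbers rather than something the card states, and it can be checked with a few lines of arithmetic:

```python
# Scores copied from the results listed above
scores = {
    "IFEval": 79.17,
    "BBH": 30.93,
    "MATH Lvl 5": 10.88,
    "GPQA": 5.03,
    "MuSR": 11.40,
    "MMLU-PRO": 31.96,
}

# Assumption: "Avg." is the plain (unweighted) mean of the six scores
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 28.23
```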

Model size: 8.03B params (Safetensors). Tensor type: BF16.