NB GPT-J-6B NorPaca
This is the NB GPT-J-6B Norwegian Bokmål model fine-tuned on the NorPaca dataset.
Usage
# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig, pipeline

base_model = "NbAiLab/nb-gpt-j-6B-norpaca"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model).cuda()
For generation, we can either use pipeline() or the model's .generate() method. Remember that the prompt needs a Norwegian template:
# Generate responses
def generate(instruction, input=None):
    if input:
        prompt = f"""Nedenfor er en instruksjon som beskriver en oppgave, sammen med et input som gir ytterligere kontekst. Skriv et svar som fullfører forespørselen på riktig måte.
### Instruksjon:
{instruction}
### Input:
{input}
### Respons:"""
    else:
        prompt = f"""Nedenfor er en instruksjon som beskriver en oppgave. Skriv et svar som fullfører forespørselen på riktig måte.
### Instruksjon:
{instruction}
### Respons:"""
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].cuda()
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=GenerationConfig(temperature=0.2, top_p=0.75, num_beams=4),
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=256,
    )
    # Print only the text after the response marker
    for seq in generation_output.sequences:
        output = tokenizer.decode(seq, skip_special_tokens=True)
        print(output.split("### Respons:")[-1].strip())

# "Write an email welcoming a new employee named Svein."
generate("Skriv en e-post der du ønsker velkommen til en ny medarbeider ved navn Svein.")
Data
The dataset is a Norwegian Bokmål translation of alpaca_gpt4_data.json, a cleaned version of the Stanford Alpaca dataset whose responses were generated with GPT-4.
This dataset cannot be used to create models that compete in any way with OpenAI.
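Each record follows the usual Alpaca schema with instruction, input, and output fields. The entry below is only a hypothetical illustration of that shape, not an actual record from the dataset:

# Hypothetical record illustrating the Alpaca schema (not an actual dataset entry)
{
    "instruction": "Skriv en e-post der du ønsker velkommen til en ny medarbeider ved navn Svein.",
    "input": "",
    "output": "Hei Svein, og velkommen til teamet! Vi gleder oss til å jobbe med deg."
}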
Finetuning
To fine-tune the NB GPT-J-6B model we used the code available in NB's fork of mesh-transformer-jax, which provides code to adapt an Alpaca-style dataset for fine-tuning any GPT-J-6B model. We ran fine-tuning for 3 epochs with a sequence length of 2048 on a single TPUv3-8 for 3 hours, starting from NB GPT-J-6B.
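Conceptually, each record is rendered into the same Norwegian prompt template shown in the Usage section, with the output field appended as the target text. The helper below is a sketch of that step under those assumptions; the exact formatting, packing, and tokenization to sequence length 2048 are handled by the mesh-transformer-jax fork:

# Hypothetical helper: render one Alpaca-style record with the Norwegian template used above.
# The real preprocessing (packing and tokenizing to length 2048) lives in the mesh-transformer-jax fork.
def record_to_training_text(record):
    if record.get("input"):
        prompt = (
            "Nedenfor er en instruksjon som beskriver en oppgave, sammen med et input som gir "
            "ytterligere kontekst. Skriv et svar som fullfører forespørselen på riktig måte.\n"
            f"### Instruksjon:\n{record['instruction']}\n"
            f"### Input:\n{record['input']}\n"
            "### Respons:"
        )
    else:
        prompt = (
            "Nedenfor er en instruksjon som beskriver en oppgave. "
            "Skriv et svar som fullfører forespørselen på riktig måte.\n"
            f"### Instruksjon:\n{record['instruction']}\n"
            "### Respons:"
        )
    return prompt + "\n" + record["output"]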
Hardware Requirements
For training we used a Google Cloud TPUv3-8 VM. For evaluation, a single T4 GPU can be used.
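In full precision the roughly 6B parameters need about 24 GB of memory, which does not fit in a T4's 16 GB, so for evaluation the model is typically loaded in half precision (about 12 GB). The snippet below is a minimal sketch; the float16 choice is an assumption, not something prescribed by this model card:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "NbAiLab/nb-gpt-j-6B-norpaca"
tokenizer = AutoTokenizer.from_pretrained(base_model)
# Half-precision weights (~12 GB) fit on a 16 GB T4
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16).cuda()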