---
license: other
datasets:
  - blip-solutions/SlovAlpaca
language:
  - sk
---

# SlovAlpaca

This repository contains LoRA weights fine-tuned on the translated version of the original Alpaca dataset (more details on the dataset card).

## Training procedure

The training was done on the 7B LLaMA model (`decapoda-research/llama-7b-hf`) quantized to 8-bit, with the following hyperparameters:

```python
MICRO_BATCH_SIZE = 3
BATCH_SIZE = 128
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
EPOCHS = 2  # paper uses 3
LEARNING_RATE = 2e-5  # from the original paper
CUTOFF_LEN = 256
LORA_R = 4
LORA_ALPHA = 16
LORA_DROPOUT = 0.05
```
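Note that with these values the effective batch size is slightly below `BATCH_SIZE`: the integer division yields 42 accumulation steps, so each optimizer step accumulates 42 × 3 = 126 examples. A quick check of the arithmetic (plain Python, no assumptions beyond the values above):

```python
MICRO_BATCH_SIZE = 3
BATCH_SIZE = 128

# Integer division truncates: 128 // 3 = 42 accumulation steps
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE

# Examples actually seen per optimizer step
effective_batch = GRADIENT_ACCUMULATION_STEPS * MICRO_BATCH_SIZE

print(GRADIENT_ACCUMULATION_STEPS, effective_batch)  # 42 126
```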

The sole goal of this project is to explore the effects of single-language fine-tuning using the same dataset and methods as the original paper, and to compare the results.

```bibtex
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
```

## How to use

### Prerequisites

```shell
!pip install datasets loralib sentencepiece
!pip uninstall -y transformers
!pip install git+https://github.com/zphang/transformers@c3dc391#egg=transformers
!pip install git+https://github.com/huggingface/peft.git
!pip install bitsandbytes
```

### Load model

```python
from peft import PeftModel
from transformers import LLaMATokenizer, LLaMAForCausalLM, GenerationConfig

tokenizer = LLaMATokenizer.from_pretrained("decapoda-research/llama-7b-hf")

model = LLaMAForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    device_map="auto",
)

model = PeftModel.from_pretrained(model, "blip-solutions/SlovAlpaca")
```

### Generation

Here is a Colab notebook for inference: https://colab.research.google.com/drive/1z4aMG7tGjchLBlg_iXDuqt3sH6bQRuQk?usp=sharing

```python
PROMPT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Kde žijú lamy?
### Response:"""

inputs = tokenizer(
    PROMPT,
    return_tensors="pt",
)
input_ids = inputs["input_ids"].cuda()

generation_config = GenerationConfig(
    temperature=0.6,
    top_p=0.95,
    repetition_penalty=1.15,
)
print("Generating...")
generation_output = model.generate(
    input_ids=input_ids,
    generation_config=generation_config,
    return_dict_in_generate=True,
    output_scores=True,
    max_new_tokens=128,
)
for s in generation_output.sequences:
    print(tokenizer.decode(s))
```
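The prompt above follows the Stanford Alpaca template for instruction-only examples. Since the decoded sequence echoes the prompt, a small helper for building prompts and stripping the echo can be handy; a minimal sketch (the helper names are hypothetical, not part of this repo):

```python
def build_prompt(instruction: str) -> str:
    """Build an Alpaca-style prompt for an instruction without extra input."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n"
        "### Instruction:\n"
        f"{instruction}\n"
        "### Response:"
    )


def extract_response(decoded: str) -> str:
    """Return only the text generated after the final '### Response:' marker."""
    return decoded.split("### Response:")[-1].strip()


prompt = build_prompt("Kde žijú lamy?")
# Simulate a decoded model output: the prompt echoed back plus the answer.
full_output = prompt + "\nLamy žiju v horách, na poli, alebo v lesoch."
print(extract_response(full_output))  # Lamy žiju v horách, na poli, alebo v lesoch.
```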

### Response

```
Generating...
 Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Kde žijú lamy?
### Response:
Lamy žiju v horách, na poli, alebo v lesoch.
```

(The generated answer translates to: "Llamas live in the mountains, in fields, or in forests.")