---
license: llama2
datasets:
  - bertin-project/alpaca-spanish
language:
  - es
library_name: transformers
pipeline_tag: text-generation
---

## Llama 2-13b-alpaca-spanish LoRA

This is a LoRA for Llama 2 13B trained on a translated [alpaca dataset](https://huggingface.co/datasets/bertin-project/alpaca-spanish) in an attempt to improve the Spanish performance of the Llama-2 foundation model, with a conversational focus.

The base model used was [The Bloke's Llama-2-13B-fp16](https://huggingface.co/TheBloke/Llama-2-13B-fp16), trained in 4-bit precision with an added padding token.

## Important INFO

The original Llama 2 model does not have a padding token, which proved restrictive during training. To address this, I added a padding token to the tokenizer associated with the model.

```python
from transformers import LlamaTokenizer, LlamaForCausalLM

model_name = 'TheBloke/Llama-2-13B-fp16'
model = LlamaForCausalLM.from_pretrained(model_name).half()
tokenizer = LlamaTokenizer.from_pretrained(model_name)

# Add a dedicated padding token and register it as the pad token
tokenizer.add_tokens(['<pad>'])
tokenizer.pad_token = '<pad>'

# Resize the embedding matrix so the new token gets an embedding row
model.resize_token_embeddings(len(tokenizer))

padded_model_name = 'Llama-2-13B-fp16-padded'

# Save the padded tokenizer and model
tokenizer.save_pretrained(padded_model_name)
model.save_pretrained(padded_model_name)
```

| Training parameters | |
| ------------- | ------ |
| LoRA scale    | 2      |
| Epochs        | 0.75   |
| Learning rate | 2e-5   |
| Warmup steps  | 100    |
| Loss          | 1.07   |
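
A minimal sketch of how the adapter could be applied on top of the padded base model using PEFT. This is not taken from the original training setup: `adapter_id` is a placeholder for this repository's Hub id, the prompt is only illustrative, and `peft`, `torch`, and `accelerate` are assumed to be installed.

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM
from peft import PeftModel

# Padded base model produced by the snippet above
base_model_name = 'Llama-2-13B-fp16-padded'
# Placeholder: replace with this LoRA repository's id on the Hugging Face Hub
adapter_id = 'path/to/this-lora-adapter'

tokenizer = LlamaTokenizer.from_pretrained(base_model_name)
model = LlamaForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map='auto',
)
# Attach the LoRA weights to the base model
model = PeftModel.from_pretrained(model, adapter_id)

prompt = "Explica brevemente qué es una LoRA."
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```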