---
license: llama2
datasets:
- bertin-project/alpaca-spanish
language:
- es
library_name: transformers
pipeline_tag: text-generation
---
## Llama 2-13b-alpaca-spanish LoRA
This is a LoRA for Llama 2 13B trained on a translated [alpaca dataset](https://huggingface.co/datasets/bertin-project/alpaca-spanish) in an attempt to improve the Spanish performance of the Llama 2 foundation model, with a conversational focus.

The base model used was [The Bloke's Llama-2-13B-fp16](https://huggingface.co/TheBloke/Llama-2-13B-fp16) with an added padding token, and training was done in 4-bit precision.
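
For reference, loading the base model in 4-bit precision with bitsandbytes looks roughly like the sketch below. The exact quantization settings used for training are not documented here, so treat them as assumptions.

```python
# Rough sketch of a 4-bit load with bitsandbytes; the quantization
# settings below are assumptions, not the exact training configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = 'TheBloke/Llama-2-13B-fp16'

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map='auto',
)
```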

## Important info
The original Llama 2 model does not have a padding token, which proved restrictive during training. To address this, I added a padding token to the tokenizer associated with the model:
```python
from transformers import LlamaTokenizer, LlamaForCausalLM

model_name = 'TheBloke/Llama-2-13B-fp16'

# Load the base model in fp16 along with its tokenizer
model = LlamaForCausalLM.from_pretrained(model_name).half()
tokenizer = LlamaTokenizer.from_pretrained(model_name)

# Add a padding token to the tokenizer
tokenizer.add_tokens(['<PAD>'])
tokenizer.pad_token = '<PAD>'

# Resize the model's embedding matrix to account for the new token
model.resize_token_embeddings(len(tokenizer))

padded_model_name = 'Llama-2-13B-fp16-padded'

# Save the padded model and tokenizer
tokenizer.save_pretrained(padded_model_name)
model.save_pretrained(padded_model_name)
```
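
To use this LoRA for inference, the adapter can be loaded on top of the padded base model with PEFT. The sketch below is illustrative: the adapter path is a placeholder for this repository, and the prompt format and generation settings are assumptions.

```python
# Illustrative inference sketch; the adapter path is a placeholder and the
# prompt format / generation settings are assumptions.
import torch
from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

base_model_name = 'Llama-2-13B-fp16-padded'   # the padded base model saved above
adapter_path = 'path/to/this-lora-adapter'    # placeholder for this repository

tokenizer = LlamaTokenizer.from_pretrained(base_model_name)
model = LlamaForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map='auto',
)
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()

# Spanish instruction prompt ("Explain what a language model is")
prompt = 'Instrucción: Explica qué es un modelo de lenguaje.\nRespuesta:'
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```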

| Training parameters | Value |
| ------------------- | ----- |
| LoRA scale          | 2     |
| Epochs              | 0.75  |
| Learning rate       | 2e-5  |
| Warmup steps        | 100   |
| Loss                | 1.07  |
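
For reference, a hypothetical reconstruction of the training setup implied by the table might look like this; the LoRA rank, target modules, dropout and batch size are not documented here and are assumptions (the rank/alpha pair is chosen so that alpha / r matches the LoRA scale of 2).

```python
# Hypothetical reconstruction of the training configuration; values marked
# as assumptions are not documented in this model card.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                                  # assumption: rank not documented
    lora_alpha=32,                         # alpha / r = 2, matching the LoRA scale above
    target_modules=['q_proj', 'v_proj'],   # assumption
    lora_dropout=0.05,                     # assumption
    bias='none',
    task_type='CAUSAL_LM',
)

training_args = TrainingArguments(
    output_dir='llama2-13b-alpaca-spanish-lora',
    num_train_epochs=0.75,                 # from the table
    learning_rate=2e-5,                    # from the table
    warmup_steps=100,                      # from the table
    per_device_train_batch_size=4,         # assumption
    fp16=True,
)
```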