---
license: other
license_name: gemma-terms-of-use
license_link: https://ai.google.dev/gemma/terms
base_model: google/gemma-7b
datasets:
- ravithejads/samvaad-hi-filtered
- Telugu-LLM-Labs/telugu_teknium_GPTeacher_general_instruct_filtered_romanized
- Telugu-LLM-Labs/telugu_alpaca_yahma_cleaned_filtered_romanized
- abhinand/tamil-alpaca
- Tensoic/airoboros-3.2_kn
- Tensoic/gpt-teacher_kn
- VishnuPJ/Alpaca_Instruct_Malayalam
- Tensoic/Alpaca-Gujarati
- HydraIndicLM/punjabi_alpaca_52K
- HydraIndicLM/bengali_alpaca_dolly_67k
- OdiaGenAI/Odia_Alpaca_instructions_52k
- yahma/alpaca-cleaned
language:
- te
- en
- ta
- ml
- hi
- kn
- gu
- bn
- pa
- or
library_name: transformers
pipeline_tag: text-generation
---

# Indic-gemma-7b-finetuned-sft-Navarasa

This model is based on [google/gemma-7b](https://huggingface.co/google/gemma-7b) and has been LoRA-finetuned on instruction datasets in 9 Indian languages and English:

1. #### Hindi - [ravithejads/samvaad-hi-filtered](https://huggingface.co/datasets/ravithejads/samvaad-hi-filtered), [HydraIndicLM/hindi_alpaca_dolly_67k](https://huggingface.co/datasets/HydraIndicLM/hindi_alpaca_dolly_67k)(sampled)
2. #### Telugu - [Telugu-LLM-Labs/yahma_alpaca_cleaned_telugu_filtered_and_romanized](https://huggingface.co/datasets/Telugu-LLM-Labs/yahma_alpaca_cleaned_telugu_filtered_and_romanized), [Telugu-LLM-Labs/teknium_GPTeacher_general_instruct_telugu_filtered_and_romanized](https://huggingface.co/datasets/Telugu-LLM-Labs/teknium_GPTeacher_general_instruct_telugu_filtered_and_romanized)
3. #### Tamil - [abhinand/tamil-alpaca](https://huggingface.co/datasets/abhinand/tamil-alpaca)
4. #### Kannada - [Tensoic/airoboros-3.2_kn](https://huggingface.co/datasets/Tensoic/airoboros-3.2_kn), [Tensoic/gpt-teacher_kn](https://huggingface.co/datasets/Tensoic/gpt-teacher_kn)
5. #### Malayalam - [VishnuPJ/Alpaca_Instruct_Malayalam](https://huggingface.co/datasets/VishnuPJ/Alpaca_Instruct_Malayalam)
6. #### Gujarati - [Tensoic/Alpaca-Gujarati](https://huggingface.co/datasets/Tensoic/Alpaca-Gujarati)
7. #### Punjabi - [HydraIndicLM/punjabi_alpaca_52K](https://huggingface.co/datasets/HydraIndicLM/punjabi_alpaca_52K)
8. #### Bengali - [HydraIndicLM/bengali_alpaca_dolly_67k](https://huggingface.co/datasets/HydraIndicLM/bengali_alpaca_dolly_67k)(alpaca filtered)
9. #### Odia - [OdiaGenAI/Odia_Alpaca_instructions_52k](https://huggingface.co/datasets/OdiaGenAI/Odia_Alpaca_instructions_52k), [OdiaGenAI/gpt-teacher-roleplay-odia-3k](https://huggingface.co/datasets/OdiaGenAI/gpt-teacher-roleplay-odia-3k)
10. #### English - [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned)

The model is finetuned using the [unsloth](https://github.com/unslothai/unsloth) library, and we provide inference code using it for faster inference. Alternatively, you can use the Hugging Face Transformers library for inference.

# Training Details:

The model was trained on approximately 500K instruction samples.
1. GPU: 1 A100, 80GB
2. Time: 36.5 Hours
3. Platform: [E2E Networks](https://www.e2enetworks.com/)

# Installation

`!pip install "unsloth[colab-ampere] @ git+https://github.com/unslothai/unsloth.git"`

# Input Text Format

```
### Instruction: {instruction}

### Input: {input}

### Response: {response}
```
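The template above can be filled with plain string formatting. The sketch below is illustrative (the `build_prompt` helper and example strings are not part of the model card); leave the response field empty when prompting for generation:

```python
def build_prompt(instruction: str, input_text: str, response: str = "") -> str:
    """Fill the Navarasa instruction template; leave `response` empty for generation."""
    return (
        f"### Instruction: {instruction}\n\n"
        f"### Input: {input_text}\n\n"
        f"### Response: {response}"
    )

prompt = build_prompt(
    "Translate the following sentence to Hindi.",
    "This model is developed by Telugu LLM Labs",
)
print(prompt)
```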

# Inference With Unsloth

```python
from unsloth import FastLanguageModel
import torch
max_seq_length = 2048
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = False
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    device_map="auto"
)
FastLanguageModel.for_inference(model) # Enable native 2x faster inference

input_prompt = """
### Instruction:
{}

### Input:
{}

### Response:
{}"""

input_text = input_prompt.format(
        "Tranlsate following sentence to Hindi.", # instruction
        "This model is developed by Telugu LLM Labs", # input
        "", # output - leave this blank for generation!
    )

inputs = tokenizer([input_text], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 300, use_cache = True)
response = tokenizer.batch_decode(outputs)
```
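Note that `batch_decode` returns the full sequence, prompt included. A common post-processing step (not part of the model card, shown here on a stand-in string rather than real model output) is to split on the `### Response:` marker and strip the end-of-sequence token:

```python
# Stand-in for tokenizer.batch_decode(outputs)[0]; real output will differ.
decoded = (
    "### Instruction:\nTranslate the following sentence to Hindi.\n\n"
    "### Input:\nThis model is developed by Telugu LLM Labs\n\n"
    "### Response:\nयह मॉडल Telugu LLM Labs द्वारा विकसित किया गया है<eos>"
)

# Keep only the text after the response marker, then drop the eos token.
answer = decoded.split("### Response:")[-1]
answer = answer.replace("<eos>", "").strip()
print(answer)
```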

# Inference with HuggingFace

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained(
    "Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa",
    load_in_4bit = False,
    device_map = "auto",
    token = hf_token  # your Hugging Face access token
)
tokenizer = AutoTokenizer.from_pretrained("Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa")

input_prompt = """
### Instruction:
{}

### Input:
{}

### Response:
{}"""

input_text = input_prompt.format(
        "Tranlsate following sentence to Hindi.", # instruction
        "This model is developed by Telugu LLM Labs", # input
        "", # output - leave this blank for generation!
    )

inputs = tokenizer([input_text], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 300, use_cache = True)
response = tokenizer.batch_decode(outputs)[0]
```

Refer to the [blog post](https://ravidesetty.medium.com/introducing-indic-gemma-7b-2b-instruction-tuned-model-on-9-indian-languages-navarasa-86bc81b4a282) for sample examples.

Please check our [Code Repository](https://github.com/TeluguLLMLabs/Indic-gemma-7b-Navarasa) for training and inference scripts.


# Developers:

The model is a collaborative effort by [Ravi Theja](https://twitter.com/ravithejads) and [Ramsri Goutham](https://twitter.com/ramsri_goutham). Feel free to DM either of us if you have any questions.