Jayveersinh-Raj committed • Commit c6fdb92 • Parent(s): 58d2fdf

Update README.md

README.md CHANGED
---
language:
- hi
- gu
- pa
- as
- ta
- mr
- bn
- te
- ml
- kn
license: other
---

# Indic-Sentence-Completion

# Details

The model may not be used commercially. It is a fine-tuned Bloom-3B covering several Indian languages:
- Gujarati
- Marathi
- Bengali
- Punjabi
- Kannada
- Malayalam
- Telugu
- Tamil
- Hindi

# Architecture

The architecture is the same as Bloom-3B: the model is decoder-only.

# Motivation behind the model fine-tuning

- The model can be fine-tuned for any downstream task that requires the aforementioned Indian languages.
- PEFT LoRA is advised for further fine-tuning.
- It can be stacked with an encoder if needed for any sequence-to-sequence task in these languages.

# Example of getting inference from the model

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Path to the model repository on the Hugging Face Hub
model_directory = "autopilot-ai/Indic-sentence-completion"

tokenizer = AutoTokenizer.from_pretrained(model_directory)
model = AutoModelForCausalLM.from_pretrained(
    model_directory,
    load_in_8bit=True,   # requires bitsandbytes; remove if running on CPU
    device_map="auto",
)

batch = tokenizer("હેલો કેમ છો?", return_tensors="pt").to(model.device)

with torch.cuda.amp.autocast():
    output_tokens = model.generate(**batch, max_new_tokens=10)

print("\n\n", tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```
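For quick experiments, the same checkpoint can also be driven through the `transformers` text-generation pipeline. This is a sketch, assuming the repository id above loads as a causal LM and fits in available memory:

```python
# Hedged sketch: high-level pipeline API instead of manual tokenize/generate.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="autopilot-ai/Indic-sentence-completion",
    device=-1,  # keep everything on CPU; use device=0 for the first GPU
)

result = generator("હેલો કેમ છો?", max_new_tokens=10)
print(result[0]["generated_text"])
```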