Jayveersinh-Raj committed on
Commit c6fdb92 · 1 Parent(s): 58d2fdf

Update README.md

Files changed (1): README.md +59 -1
README.md CHANGED
@@ -1,4 +1,62 @@
- Indic-GPT
+ ---
+ language:
+ - hi
+ - gu
+ - pa
+ - as
+ - ta
+ - mr
+ - bn
+ - te
+ - ml
+ - kn
+ ---
+ Indic-Sentence-Completion
  ---
  license: other
  ---
+
+ # Details
+ The model cannot be used commercially. It is a fine-tuned Bloom-3B covering several Indian languages:
+ - Gujarati
+ - Marathi
+ - Bengali
+ - Punjabi
+ - Kannada
+ - Malayalam
+ - Telugu
+ - Tamil
+ - Hindi
+
+ # Architecture
+ The same as Bloom-3B: a decoder-only transformer.
+
+ # Motivation behind the model fine-tuning
+ - The model can be fine-tuned for any downstream task that requires the aforementioned Indian languages.
+ - PEFT LoRA is advised for further fine-tuning.
+ - It can be stacked with an encoder for any sequence-to-sequence task in these languages.
+
+ # Example of getting inference from the model
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Path to the directory containing the model files
+ model_directory = "autopilot-ai/Indic-sentence-completion"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_directory)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_directory,
+     load_in_8bit=True,
+     device_map="auto",
+ )
+
+ # Tokenize a Gujarati prompt and move it to the model's device
+ batch = tokenizer("હેલો કેમ છો?", return_tensors='pt').to(model.device)
+
+ with torch.cuda.amp.autocast():
+     output_tokens = model.generate(**batch, max_new_tokens=10)
+
+ print('\n\n', tokenizer.decode(output_tokens[0], skip_special_tokens=True))