Jayveersinh-Raj committed on
Commit c6fdb92 · 1 Parent(s): 58d2fdf

Update README.md

Files changed (1): README.md +59 -1
README.md CHANGED
@@ -1,4 +1,62 @@
- Indic-GPT
+ ---
+ language:
+ - hi
+ - gu
+ - pa
+ - as
+ - ta
+ - mr
+ - bn
+ - te
+ - ml
+ - kn
+ ---
+ Indic-Sentence-Completion
  ---
  license: other
  ---
+
+ # Details
+ The model cannot be used commercially. It is a fine-tuned Bloom-3B covering several Indian languages:
+ - Gujarati
+ - Marathi
+ - Bengali
+ - Punjabi
+ - Kannada
+ - Malayalam
+ - Telugu
+ - Tamil
+ - Hindi
+
+ # Architecture
+ The same as Bloom-3B: a decoder-only transformer.
+
+ # Motivation behind the model fine-tuning
+ - The model can be fine-tuned for any downstream task that requires the aforementioned Indian languages.
+ - PEFT LoRA is advised for further fine-tuning.
+ - It can be stacked with an encoder for any sequence-to-sequence task in these languages.
+
+ # Example of getting inference from the model
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Path to the directory containing the model files
+ model_directory = "autopilot-ai/Indic-sentence-completion"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_directory)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_directory,
+     load_in_8bit=True,
+     device_map="auto",
+ )
+
+ # Tokenize a Gujarati prompt and move it to the model's device
+ batch = tokenizer("હેલો કેમ છો?", return_tensors='pt').to(model.device)
+
+ with torch.cuda.amp.autocast():
+     output_tokens = model.generate(**batch, max_new_tokens=10)
+
+ print('\n\n', tokenizer.decode(output_tokens[0], skip_special_tokens=True))