Hugging Face: Banglish to Bangla Translation
This repository demonstrates how to use a Hugging Face model to translate Banglish (Romanized Bangla) text into Bangla using the MBart50 tokenizer and model. The model, Mdkaif2782/banglish-to-bangla, is a pre-trained checkpoint fine-tuned for this task.
Setup in Google Colab
Follow these steps to use the model in Google Colab:
1. Install Dependencies
Make sure the transformers and torch libraries are installed. Run the following command in your Colab notebook:
!pip install transformers torch
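As an optional check after installation, a small sketch like the one below can report which packages Python still cannot find. The helper name missing_packages is hypothetical and not part of either library; it relies only on importlib.util.find_spec from the standard library.

```python
import importlib.util


def missing_packages(names=("transformers", "torch")):
    """Return the names from `names` that cannot be found in this environment."""
    # find_spec returns None when a top-level package is not installed
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All dependencies are installed.")
```

If anything is reported as missing, rerun the install command above and restart the Colab runtime before continuing.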
2. Load and Use the Model
Copy the code below into a cell in your Colab notebook to start translating Banglish to Bangla:
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
import torch
# Load the pre-trained model and tokenizer directly from Hugging Face
model_name = "Mdkaif2782/banglish-to-bangla"
tokenizer = MBart50TokenizerFast.from_pretrained(model_name)
model = MBartForConditionalGeneration.from_pretrained(model_name)
def translate_banglish_to_bangla(model, tokenizer, banglish_input):
    # Tokenize the Banglish text, truncating inputs longer than 128 tokens
    inputs = tokenizer(banglish_input, return_tensors="pt", padding=True, truncation=True, max_length=128)
    # Move the inputs and the model to the GPU when one is available
    if torch.cuda.is_available():
        inputs = {key: value.cuda() for key, value in inputs.items()}
        model = model.cuda()
    # Generate Bangla output, starting decoding from the bn_IN language token
    translated_tokens = model.generate(**inputs, decoder_start_token_id=tokenizer.lang_code_to_id["bn_IN"])
    translated_text = tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]
    return translated_text
# Take custom input
print("Enter your Banglish text (type 'exit' to quit):")
while True:
    banglish_text = input("Banglish: ")
    if banglish_text.lower() == "exit":
        break
    # Translate Banglish to Bangla
    translated_text = translate_banglish_to_bangla(model, tokenizer, banglish_text)
    print(f"Translated Bangla: {translated_text}\n")
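The interactive loop above translates one sentence per call. For a list of sentences, a batched variant may be more convenient. The following is a minimal sketch, not the author's method: the function name translate_batch and the batch_size parameter are hypothetical, and it assumes that tokenizer.convert_tokens_to_ids("bn_IN") returns the same id as the lang_code_to_id lookup used above. It expects the model and tokenizer objects loaded earlier.

```python
def translate_batch(model, tokenizer, banglish_sentences, batch_size=8, max_length=128):
    """Translate a list of Banglish strings, batch_size sentences at a time."""
    # Assumption: this lookup matches tokenizer.lang_code_to_id["bn_IN"]
    bangla_token_id = tokenizer.convert_tokens_to_ids("bn_IN")
    translations = []
    for start in range(0, len(banglish_sentences), batch_size):
        batch = banglish_sentences[start:start + batch_size]
        # Pad to the longest sentence in the batch so the tensors line up
        inputs = tokenizer(batch, return_tensors="pt", padding=True,
                           truncation=True, max_length=max_length)
        # Place the inputs on whichever device the model already lives on
        inputs = inputs.to(model.device)
        generated = model.generate(**inputs, decoder_start_token_id=bangla_token_id)
        # Decode every sequence in the batch, preserving input order
        translations.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return translations
```

A call such as translate_batch(model, tokenizer, ["amar valo lagche onek", "tumi kemon acho"]) would return one Bangla string per input, in the same order. Because the sketch reads model.device instead of calling .cuda() itself, move the model to the GPU once beforehand if one is available.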
3. Run the Notebook
- Paste the above code into a cell.
- Run the cell.
- Enter your Banglish text at the input prompt to get the translated Bangla text. Type exit to quit.
Example Usage
Input:
Banglish: amar valo lagche onek
Output:
Translated Bangla: আমার ভালো লাগছে অনেক
Notes
- Ensure your Google Colab runtime uses a GPU for faster processing. Go to Runtime > Change runtime type and select GPU.
- The model Mdkaif2782/banglish-to-bangla can be fine-tuned further if required.
License
This project uses the Hugging Face transformers library. Refer to the Hugging Face documentation for more details.
Model tree for Mdkaif2782/banglish-to-bangla
Base model: facebook/mbart-large-50