---
tags:
- translation
- causal-lm
- text-generation
- huggingface
- phi-3
- lora
- tamil
license: apache-2.0
datasets:
- aryaumesh/english-to-tamil
model-index:
- name: shangeth/phi3-mini-ta_en
  results:
  - task:
      type: translation
    dataset:
      name: aryaumesh/english-to-tamil
      type: text
    metrics:
    - name: BLEU
      type: bleu
      value: TBD
---
# Model Card: Phi-3 Mini Tamil-English Translator
## Model Overview
- **Model Name:** shangeth/phi3-mini-ta_en
- **Base Model:** microsoft/Phi-3-mini-128k-instruct
- **Fine-tuned On:** [aryaumesh/english-to-tamil](https://huggingface.co/datasets/aryaumesh/english-to-tamil)
- **Quantization:** 4-bit (LoRA fine-tuned)
- **Task:** English-to-Tamil and Tamil-to-English translation
## Model Description
This model is a fine-tuned version of `microsoft/Phi-3-mini-128k-instruct`, optimized for bidirectional translation between English and Tamil. The model has been fine-tuned using Low-Rank Adaptation (LoRA) with 4-bit quantization, enabling efficient inference on resource-constrained devices.
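To give a rough sense of why LoRA keeps fine-tuning cheap, the sketch below compares the trainable parameters of one full weight matrix with those of its low-rank adapter. The matrix shape and the rank are illustrative assumptions; this card does not state the rank or the target modules that were actually used.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in           # every entry is updated in full fine-tuning
    lora = rank * (d_out + d_in)  # adapter factors: B is d_out x rank, A is rank x d_in
    return full, lora


# Illustrative values only: a 3072 x 3072 projection and a hypothetical rank of 16.
full, lora = lora_param_counts(3072, 3072, 16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

Only the two small adapter factors are trained; the base weights stay frozen, which is what allows them to be held in 4-bit precision during fine-tuning.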
## Training Details
- **Dataset Used:** `aryaumesh/english-to-tamil`
- **Training Methodology:**
  - LoRA fine-tuning on bidirectional translation pairs
  - EOS token appended to training data
  - Mixed-precision training (bfloat16)
- **Training Hardware:** NVIDIA A100 GPU (4-bit quantization enabled)
- **Checkpoints:** Saved at regular intervals during training; the final merged model is the one uploaded here.
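The bidirectional pairs and the appended EOS token described above could be built roughly as follows. This is a minimal sketch, not the actual preprocessing script: it assumes training used the same prompt template as the inference example in this card, and `build_bidirectional_examples` and the placeholder EOS string are hypothetical names chosen for illustration.

```python
# Assumption: training reused the prompt template from the inference example.
PROMPT = "Translate the following sentence to {target_language}: {input_text}\nTranslated Sentence:"


def build_bidirectional_examples(english, tamil, eos_token):
    """Turn one parallel sentence pair into two training strings, one per direction."""
    to_tamil = PROMPT.format(target_language="Tamil", input_text=english)
    to_english = PROMPT.format(target_language="English", input_text=tamil)
    # Appending EOS teaches the model to stop once the translation is complete.
    return [
        f"{to_tamil} {tamil}{eos_token}",
        f"{to_english} {english}{eos_token}",
    ]


# Placeholder EOS string; in practice this would come from tokenizer.eos_token.
examples = build_bidirectional_examples(
    "Hello, how are you?",
    "வணக்கம், நீங்கள் எப்படி இருக்கிறீர்கள்?",
    eos_token="<|endoftext|>",
)
for example in examples:
    print(repr(example))
```

Each source pair yields one English-to-Tamil and one Tamil-to-English example, so both directions are seen equally often during training.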
## How to Use the Model
### Inference Example
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "shangeth/phi3-mini-ta_en"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()


def translate_text(input_text, target_language="Tamil"):
    """Translate input_text into target_language ("Tamil" or "English")."""
    prompt = f"Translate the following sentence to {target_language}: {input_text}\nTranslated Sentence:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(**inputs, max_length=512, num_beams=5, early_stopping=True)
    # generate() returns the prompt followed by the continuation; keep only the new tokens.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


input_sentence = "Hello, how are you?"
translated_sentence = translate_text(input_sentence, target_language="Tamil")
print("Translated Sentence:", translated_sentence)
```
## Performance
- **BLEU Score (English to Tamil):** TBD
- **BLEU Score (Tamil to English):** TBD
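Until scores are reported, the sketch below shows how a corpus-level BLEU score is computed: clipped n-gram precisions up to 4-grams, combined by a geometric mean and scaled by a brevity penalty. It is a simplified illustration that assumes whitespace tokenization, a single reference per sentence, and no smoothing; the sentences are toy data, not model outputs. For published numbers, an established tool such as sacreBLEU is preferable because tokenization choices affect the score.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale, one reference per hypothesis, no smoothing."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clipping: an n-gram is credited at most as often as it occurs in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Toy data only: the hypothesis is missing the last word of the reference.
score = corpus_bleu(["the cat sat on the"], ["the cat sat on the mat"])
print(f"BLEU: {score:.2f}")
```

The same function can be run once per direction, with English references for Tamil-to-English output and Tamil references for English-to-Tamil output.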
## Limitations
- May struggle with domain-specific terminology.
- Potential biases in translations due to dataset limitations.
- Accuracy can be improved with further fine-tuning on specialized datasets.
## Citation
If you use this model in your research or application, please cite:
```bibtex
@misc{shangeth_phi3_mini_ta_en,
author = {Shangeth Rajaa},
title = {Phi-3 Mini Tamil-English Translator},
year = {2024},
url = {https://huggingface.co/shangeth/phi3-mini-ta_en}
}
```
## Contact
For questions or contributions, feel free to reach out via the [Hugging Face discussions](https://huggingface.co/shangeth/phi3-mini-ta_en/discussions).