---
tags:
- translation
- causal-lm
- text-generation
- huggingface
- phi-3
- lora
- tamil
license: apache-2.0
datasets:
- aryaumesh/english-to-tamil
model-index:
- name: shangeth/phi3-mini-ta_en
  results:
  - task:
      type: translation
    dataset:
      name: aryaumesh/english-to-tamil
      type: text
    metrics:
    - name: BLEU
      type: bleu
      value: TBD
---

# Model Card: Phi-3 Mini Tamil-English Translator

## Model Overview

**Model Name:** shangeth/phi3-mini-ta_en
**Base Model:** microsoft/Phi-3-mini-128k-instruct
**Fine-tuned On:** [aryaumesh/english-to-tamil](https://huggingface.co/datasets/aryaumesh/english-to-tamil)
**Quantization:** 4-bit (LoRA fine-tuned)
**Task:** English-to-Tamil and Tamil-to-English translation

## Model Description

This model is a fine-tuned version of `microsoft/Phi-3-mini-128k-instruct`, optimized for bidirectional translation between English and Tamil. It was fine-tuned with Low-Rank Adaptation (LoRA) and 4-bit quantization, which keeps memory use low and makes inference feasible on resource-constrained devices.

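To give a rough sense of why LoRA is cheap to train, the sketch below compares the number of trainable parameters in a full update of one square weight matrix with the number in a rank-`r` LoRA update, which trains two small matrices of shapes `(d_out, r)` and `(r, d_in)` instead. The hidden size of 3072 and the rank of 16 are assumptions chosen for illustration; this card does not state the rank or the target modules used in training.

```python
def full_update_params(d_in, d_out):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_in * d_out


def lora_update_params(d_in, d_out, rank):
    """Trainable parameters for a LoRA update: B is (d_out, rank), A is (rank, d_in)."""
    return rank * (d_in + d_out)


hidden_size = 3072  # assumption: hidden size of the base model
rank = 16           # hypothetical LoRA rank, not taken from this card

full = full_update_params(hidden_size, hidden_size)
lora = lora_update_params(hidden_size, hidden_size, rank)
print(f"full: {full:,}  LoRA: {lora:,}  ratio: {full / lora:.0f}x")
```

Under these assumed values, the LoRA update trains roughly two orders of magnitude fewer parameters per adapted matrix than a full update would.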

## Training Details

- **Dataset Used:** `aryaumesh/english-to-tamil`
- **Training Methodology:**
  - LoRA fine-tuning on bidirectional translation pairs
  - EOS token appended to training data
  - Mixed-precision training (bfloat16)
- **Training Hardware:** NVIDIA A100 GPU (4-bit quantization enabled)
- **Checkpoints:** Saved at regular intervals and final merged model uploaded

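The bidirectional pairs and the appended EOS token described above can be sketched as follows. This is a minimal illustration, not the actual training script: the template mirrors the inference prompt used in this card, while the helper name `build_training_examples`, the space before the target text, and the default EOS string are assumptions. In practice the EOS string would come from `tokenizer.eos_token`.

```python
def build_training_examples(english, tamil, eos_token="<|endoftext|>"):
    """Return one training string per translation direction for a sentence pair.

    The eos_token default is an assumption; pass tokenizer.eos_token in real use.
    """
    template = (
        "Translate the following sentence to {lang}: {src}\n"
        "Translated Sentence: {tgt}{eos}"
    )
    return [
        # English -> Tamil
        template.format(lang="Tamil", src=english, tgt=tamil, eos=eos_token),
        # Tamil -> English
        template.format(lang="English", src=tamil, tgt=english, eos=eos_token),
    ]


for example in build_training_examples("Hello", "வணக்கம்"):
    print(repr(example))
```

Ending every example with the EOS token is what teaches the model to stop after the translation instead of continuing to generate text.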

## How to Use the Model

### Inference Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "shangeth/phi3-mini-ta_en"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def translate_text(input_text, target_language="Tamil"):
    # Build the translation prompt for the requested target language.
    prompt = f"Translate the following sentence to {target_language}: {input_text}\nTranslated Sentence:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(**inputs, max_length=512, num_beams=5, early_stopping=True)
    # generate() returns the prompt tokens followed by the new tokens,
    # so decode only the part after the prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

input_sentence = "Hello, how are you?"
translated_sentence = translate_text(input_sentence, target_language="Tamil")
print("Translated Sentence:", translated_sentence)
```

## Performance

- **BLEU Score (English to Tamil):** TBD
- **BLEU Score (Tamil to English):** TBD

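Until scores are published, the sketch below shows how a BLEU score is formed from clipped n-gram precisions and a brevity penalty. It is a simplified, unsmoothed, sentence-level version with whitespace tokenization, all of which are assumptions made for illustration. Reported scores are normally corpus-level and computed with an established tool such as sacreBLEU, and tokenization choices matter a great deal for Tamil.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed BLEU for one candidate against one reference (whitespace tokens)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        # Clipped matches: a candidate n-gram counts at most as often as it occurs in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives a score of 0
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty lowers the score of candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


print(sentence_bleu("the cat sat on the mat", "the cat sat on the mat"))
```

Because this version is unsmoothed, any sentence with no matching 4-gram scores zero, which is one reason corpus-level evaluation is preferred for reporting.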

## Limitations

- May struggle with domain-specific terminology.
- Potential biases in translations due to dataset limitations.
- Accuracy can be improved with further fine-tuning on specialized datasets.

## Citation

If you use this model in your research or application, please cite:

```bibtex
@misc{shangeth_phi3_mini_ta_en,
  author = {Shangeth Rajaa},
  title = {Phi-3 Mini Tamil-English Translator},
  year = {2024},
  url = {https://huggingface.co/shangeth/phi3-mini-ta_en}
}
```

## Contact

For questions or contributions, feel free to reach out via the [Hugging Face discussions](https://huggingface.co/shangeth/phi3-mini-ta_en/discussions).