---
tags:
- translation
- causal-lm
- text-generation
- huggingface
- phi-3
- lora
- tamil
license: apache-2.0
datasets:
- aryaumesh/english-to-tamil
model-index:
- name: shangeth/phi3-mini-ta_en
results:
- task:
type: translation
dataset:
name: aryaumesh/english-to-tamil
type: text
metrics:
- name: BLEU
type: bleu
value: TBD
---
# Model Card: Phi-3 Mini Tamil-English Translator
## Model Overview
- **Model Name:** shangeth/phi3-mini-ta_en
- **Base Model:** microsoft/Phi-3-mini-128k-instruct
- **Fine-tuned On:** [aryaumesh/english-to-tamil](https://huggingface.co/datasets/aryaumesh/english-to-tamil)
- **Quantization:** 4-bit (LoRA fine-tuned)
- **Task:** English-to-Tamil and Tamil-to-English translation
## Model Description
This model is a fine-tuned version of `microsoft/Phi-3-mini-128k-instruct`, optimized for bidirectional translation between English and Tamil. The model has been fine-tuned using Low-Rank Adaptation (LoRA) with 4-bit quantization, enabling efficient inference on resource-constrained devices.
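For memory-constrained environments, the merged weights can be loaded in 4-bit at inference time with `bitsandbytes`. A minimal sketch, assuming an NF4 quantization setup (the exact configuration used during training is not published here):
```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative 4-bit NF4 configuration -- an assumption, not the published training setup.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "shangeth/phi3-mini-ta_en",
    quantization_config=bnb_config,
    device_map="auto",  # requires the `accelerate` package
)
```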
## Training Details
- **Dataset Used:** `aryaumesh/english-to-tamil`
- **Training Methodology** (a minimal sketch follows this list):
  - LoRA fine-tuning on bidirectional translation pairs
  - EOS token appended to each training example so the model learns where a translation ends
  - Mixed-precision training (bfloat16)
- **Training Hardware:** NVIDIA A100 GPU (4-bit quantization enabled)
- **Checkpoints:** Saved at regular intervals; the final merged model was uploaded to this repository
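The sketch below shows the kind of LoRA setup described above, using `peft`. The rank, alpha, dropout, and target modules are illustrative assumptions; the actual hyperparameters are not published:
```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "microsoft/Phi-3-mini-128k-instruct"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

# Hypothetical LoRA hyperparameters -- the card does not publish the real values.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["qkv_proj", "o_proj"],  # assumed Phi-3 attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

def format_example(source, target, target_language="Tamil"):
    # Mirror the inference prompt and append EOS so the model learns where to stop.
    return (
        f"Translate the following sentence to {target_language}: {source}\n"
        f"Translated Sentence: {target}{tokenizer.eos_token}"
    )
```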
## How to Use the Model
### Inference Example
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "shangeth/phi3-mini-ta_en"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
model.eval()

def translate_text(input_text, target_language="Tamil"):
    prompt = f"Translate the following sentence to {target_language}: {input_text}\nTranslated Sentence:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=512,  # cap the translation length independently of the prompt
            num_beams=5,
            early_stopping=True,
        )
    # Decode only the newly generated tokens, skipping the echoed prompt.
    generated = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(generated, skip_special_tokens=True)

input_sentence = "Hello, how are you?"
translated_sentence = translate_text(input_sentence, target_language="Tamil")
print("Translated Sentence:", translated_sentence)
```
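Because the model is trained on bidirectional pairs, the same helper handles Tamil-to-English by switching `target_language`:
```python
# Tamil-to-English: reuse the helper with the other target language.
tamil_sentence = "வணக்கம், எப்படி இருக்கிறீர்கள்?"
print("Translated Sentence:", translate_text(tamil_sentence, target_language="English"))
```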
## Performance
- **BLEU Score (English to Tamil):** TBD
- **BLEU Score (Tamil to English):** TBD
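These scores have not yet been reported. A minimal sketch of how they could be computed with the `evaluate` library, reusing the `translate_text` helper above (the split slice and the `en`/`ta` column names are assumptions about the dataset layout):
```python
import evaluate
from datasets import load_dataset

# Assumption: the dataset exposes "en" and "ta" text columns in its "train" split.
data = load_dataset("aryaumesh/english-to-tamil", split="train[:200]")
bleu = evaluate.load("sacrebleu")

predictions = [translate_text(ex["en"], target_language="Tamil") for ex in data]
references = [[ex["ta"]] for ex in data]  # sacreBLEU expects a list of reference lists

print(bleu.compute(predictions=predictions, references=references))
```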
## Limitations
- May struggle with domain-specific terminology.
- Potential biases in translations due to dataset limitations.
- Accuracy can be improved with further fine-tuning on specialized datasets.
## Citation
If you use this model in your research or application, please cite:
```bibtex
@misc{shangeth_phi3_mini_ta_en,
author = {Shangeth Rajaa},
title = {Phi-3 Mini Tamil-English Translator},
year = {2024},
url = {https://huggingface.co/shangeth/phi3-mini-ta_en}
}
```
## Contact
For questions or contributions, feel free to reach out via the [Hugging Face discussions](https://huggingface.co/shangeth/phi3-mini-ta_en/discussions).