tags:
  - translation
  - causal-lm
  - text-generation
  - huggingface
  - phi-3
  - lora
  - tamil
license: apache-2.0
datasets:
  - aryaumesh/english-to-tamil
model-index:
  - name: shangeth/phi3-mini-ta_en
    results:
      - task:
          type: translation
        dataset:
          name: aryaumesh/english-to-tamil
          type: text
        metrics:
          - name: BLEU
            type: bleu
            value: TBD

Model Card: Phi-3 Mini Tamil-English Translator

Model Overview

Model Name: shangeth/phi3-mini-ta_en
Base Model: microsoft/Phi-3-mini-128k-instruct
Fine-tuned On: aryaumesh/english-to-tamil
Fine-tuning Method: LoRA on a 4-bit quantized base model
Task: English-to-Tamil and Tamil-to-English translation

Model Description

This model is a fine-tuned version of microsoft/Phi-3-mini-128k-instruct for bidirectional translation between English and Tamil. It was fine-tuned using Low-Rank Adaptation (LoRA) with 4-bit quantization, which reduces memory use and is intended to make the model practical on resource-constrained hardware.

Training Details

  • Dataset Used: aryaumesh/english-to-tamil
  • Training Methodology:
    • LoRA fine-tuning on bidirectional translation pairs
    • EOS token appended to training data
    • Mixed-precision training (bfloat16)
  • Training Hardware: NVIDIA A100 GPU (4-bit quantization enabled)
  • Checkpoints: Saved at regular intervals; the final merged model is uploaded to the Hugging Face Hub
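
The data preparation listed above (bidirectional pairs with an EOS token appended) can be sketched as follows. This is a minimal illustration, not the author's training script. The record keys `en` and `ta`, the helper name `build_bidirectional_examples`, and the reuse of the inference prompt template for training are all assumptions. The EOS string is passed in by the caller; in practice it would be `tokenizer.eos_token`.

```python
from typing import Dict, List

# Assumed prompt template, copied from the inference example in this card.
PROMPT = "Translate the following sentence to {lang}: {src}\nTranslated Sentence:"


def build_bidirectional_examples(
    pairs: List[Dict[str, str]], eos_token: str
) -> List[str]:
    """Turn each English/Tamil pair into two training strings, one per direction.

    `pairs` is assumed to hold dicts with the hypothetical keys "en" and "ta".
    The EOS token is appended so the model learns where a translation ends.
    """
    examples = []
    for pair in pairs:
        en, ta = pair["en"].strip(), pair["ta"].strip()
        # English -> Tamil
        examples.append(PROMPT.format(lang="Tamil", src=en) + " " + ta + eos_token)
        # Tamil -> English
        examples.append(PROMPT.format(lang="English", src=ta) + " " + en + eos_token)
    return examples


if __name__ == "__main__":
    demo = [{"en": "Thank you.", "ta": "நன்றி."}]
    # "<eos>" is a placeholder; use tokenizer.eos_token with the real tokenizer.
    for text in build_bidirectional_examples(demo, eos_token="<eos>"):
        print(repr(text))
```

Each source pair yields two strings, so the training set is twice the size of the raw dataset and both directions are seen equally often.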

How to Use the Model

Inference Example

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "shangeth/phi3-mini-ta_en"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()  # disable dropout for inference

def translate_text(input_text, target_language="Tamil"):
    """Translate input_text into target_language ("Tamil" or "English")."""
    prompt = f"Translate the following sentence to {target_language}: {input_text}\nTranslated Sentence:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # max_new_tokens limits only the generated text, not prompt + output
        outputs = model.generate(**inputs, max_new_tokens=256, num_beams=5, early_stopping=True)
    # Drop the prompt tokens so only the translation is returned
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

input_sentence = "Hello, how are you?"
translated_sentence = translate_text(input_sentence, target_language="Tamil")
print("Translated Sentence:", translated_sentence)

Performance

  • BLEU Score (English to Tamil): TBD
  • BLEU Score (Tamil to English): TBD
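
Until scores are reported, BLEU could be computed along the following lines. This is a simplified pure-Python sketch of corpus-level BLEU (n-grams up to order 4, one reference per hypothesis, whitespace tokenization, no smoothing), not the evaluation used by the author. The function names `ngram_counts` and `corpus_bleu` are hypothetical.

```python
import math
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses: List[str], references: List[str], max_n: int = 4) -> float:
    """Corpus-level BLEU in [0, 100] with one reference per hypothesis."""
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")
    matches = [0] * max_n  # clipped n-gram matches per order
    totals = [0] * max_n   # hypothesis n-grams per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngram_counts(h, n), ngram_counts(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # no smoothing: any order with zero matches gives BLEU 0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes hypotheses shorter than the references.
    brevity = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity * math.exp(log_precision)
```

Whitespace splitting is a coarse tokenizer for Tamil, so scores from this sketch are not comparable with numbers produced by a standard tool such as sacreBLEU, which is the usual choice for reported results.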

Limitations

  • May struggle with domain-specific terminology.
  • Potential biases in translations due to dataset limitations.
  • Accuracy may improve with further fine-tuning on specialized datasets.

Citation

If you use this model in your research or application, please cite:

@misc{shangeth_phi3_mini_ta_en,
  author = {Shangeth Rajaa},
  title = {Phi-3 Mini Tamil-English Translator},
  year = {2024},
  url = {https://huggingface.co/shangeth/phi3-mini-ta_en}
}

Contact

For questions or contributions, open a thread in the model's Hugging Face Discussions tab.