---
license: mit
datasets:
- mwz/ur_para
language:
- ur
tags:
- paraphrase
pipeline_tag: text2text-generation
---
# Urdu Paraphrasing Model
This repository contains a pretrained model for Urdu paraphrasing. The model is based on the BERT architecture and was fine-tuned on the `mwz/ur_para` dataset of Urdu paraphrases.
## Model Description
The model uses the BERT architecture and is fine-tuned specifically for paraphrasing Urdu text. It was trained on a corpus of Urdu paraphrases and, given an Urdu input, generates paraphrases of it.
## Model Details
- Model Name: Urdu-Paraphrasing-BERT
- Base Model: BERT
- Architecture: Transformer
- Language: Urdu
- Dataset: Urdu Paraphrasing Dataset (`mwz/ur_para`)
## How to Use
You can use this model to generate paraphrases of Urdu text through the `transformers` pipeline API. Here's an example:
```python
from transformers import pipeline

# Load the model (replace the placeholder with the actual model path or Hub ID)
model = pipeline("text2text-generation", model="path_to_pretrained_model")

# Generate paraphrases
input_text = "Urdu input text for paraphrasing."
# Returning several sequences requires beam search or sampling,
# so beam search is enabled with num_beams >= num_return_sequences
paraphrases = model(input_text, max_length=128, num_beams=3, num_return_sequences=3)

# Print the generated paraphrases
print("Original Input Text:", input_text)
print("Generated Paraphrases:")
for paraphrase in paraphrases:
    print(paraphrase["generated_text"])
```
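When several sequences are requested, the returned list can contain duplicates or a copy of the input itself. The following is a minimal post-processing sketch, not part of the model or the `transformers` library: the helper name `unique_paraphrases` is hypothetical, and the sample output is hard-coded (an assumption for illustration) in the same list-of-dicts shape that the pipeline returns, so the snippet runs without loading a model.

```python
def unique_paraphrases(input_text, outputs):
    """Return distinct generated texts in order, skipping copies of the input."""
    seen = {input_text.strip()}
    results = []
    for item in outputs:
        text = item["generated_text"].strip()
        if text and text not in seen:
            seen.add(text)
            results.append(text)
    return results


# Hypothetical pipeline output, hard-coded so the sketch runs without a model.
# The second entry repeats the input; the third repeats the first entry
# with a trailing space.
input_text = "میں اسکول جا رہا ہوں"
outputs = [
    {"generated_text": "میں مدرسے جا رہا ہوں"},
    {"generated_text": "میں اسکول جا رہا ہوں"},
    {"generated_text": "میں مدرسے جا رہا ہوں "},
]

for text in unique_paraphrases(input_text, outputs):
    print(text)
```

With real pipeline output, pass the list returned by the pipeline call in place of the hard-coded `outputs`.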
## Training
The model was trained using the Hugging Face transformers library. The training process involved fine-tuning the base BERT model on the Urdu Paraphrasing Dataset.
## Evaluation
The model's performance was evaluated on a held-out validation set using BLEU, ROUGE, and perplexity. Results may vary depending on the specific use case.
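To make the metrics above concrete, here is a simplified, dependency-free sketch of the quantities behind them. It is not the evaluation script used for this model. The function names, the whitespace tokenization, and the sample sentences are all assumptions for illustration. Full BLEU additionally combines several n-gram orders with a brevity penalty, and in practice an established metrics library would normally be used instead.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n=1):
    """Clipped n-gram precision, the per-order quantity that BLEU combines."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched / total


def rouge_l_f1(candidate, reference):
    """F1 variant of ROUGE-L, based on the longest common subsequence of tokens."""
    rows, cols = len(candidate), len(reference)
    lcs = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if candidate[i] == reference[j]:
                lcs[i + 1][j + 1] = lcs[i][j] + 1
            else:
                lcs[i + 1][j + 1] = max(lcs[i][j + 1], lcs[i + 1][j])
    common = lcs[rows][cols]
    if common == 0:
        return 0.0
    precision, recall = common / rows, common / cols
    return 2 * precision * recall / (precision + recall)


def perplexity(token_log_probs):
    """Perplexity from the natural-log probabilities a model assigns to each token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Illustrative sentences (assumed, not taken from the dataset), split on whitespace
reference = "میں اسکول جا رہا ہوں".split()
candidate = "میں مدرسے جا رہا ہوں".split()

print("Unigram precision:", clipped_precision(candidate, reference, n=1))
print("Bigram precision:", clipped_precision(candidate, reference, n=2))
print("ROUGE-L F1:", rouge_l_f1(candidate, reference))
print("Perplexity:", perplexity([math.log(0.5)] * 4))
```

Higher precision and ROUGE-L values indicate more overlap with the reference. For paraphrasing, very high overlap can also mean the output barely differs from the input, so these scores are best read alongside a manual check of the generated text.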
## Acknowledgments
- The pretrained model is based on the BERT architecture developed by Google Research.
## License
This model and the associated code are licensed under the MIT License.