---
license: apache-2.0
language:
- en
base_model:
- google-t5/t5-base
datasets:
- abhinavsarkar/C4-200m-550k-Determiner
library_name: transformers
---
# Model Card for Google-T5-base-Grammatical-Error-Correction-Finetuned-C4-200M-550k
This model is fine-tuned for grammatical error correction (GEC). It generates grammatically correct text from input sentences containing diverse types of errors, making it useful for writing enhancement and grammar correction across various domains.
## Model Details
### Model Description
This model is a fine-tuned version of [google-t5/t5-base](https://huggingface.co/google-t5/t5-base) trained for grammatical error correction.
Given an input sentence containing errors, it generates a grammatically corrected version, making it suitable for creative, professional, and instructional writing alike.
- **Developed by:** Abhinav Sarkar
- **Shared by:** abhinavsarkar
- **Model type:** Sequence-to-sequence (encoder-decoder) language model
- **Languages:** English
- **Finetuned from model:** Google-T5-base
## Uses
### Direct Use
This model is suitable for grammar and language correction tools, enhancing writing quality in emails, blogs, social media posts, and more.
It is particularly helpful for users seeking to improve the grammar and accuracy of their English across various communication formats.
### Downstream Use
The model can be integrated into systems that require high-quality text generation and correction (a minimal integration sketch follows this list), such as:
- Grammar and spell-checking software
- Educational platforms for language learning
- Writing assistance tools for professionals
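
For instance, a writing-assistance backend could expose the model through the Transformers `pipeline` API. A minimal sketch, assuming only the model id from this card (the `check` helper and its generation parameters are illustrative):
```python
from transformers import pipeline

# Wrap the fine-tuned model in a text2text-generation pipeline.
corrector = pipeline(
    "text2text-generation",
    model="abhinavsarkar/Google-T5-base-Grammatical_Error_Correction-Finetuned-C4-200M-550k",
)

def check(text: str) -> str:
    # Return the single most likely correction for a sentence.
    return corrector(text, max_length=64, num_beams=4)[0]["generated_text"]

print(check("She don't like apples."))
```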
## How to Get Started with the Model
Use the following code snippets to get started with the model:
- Prerequisites
```python
!pip install -U sentencepiece transformers torch
```
- Loading the model and its tokenizer
```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

model_name = 'abhinavsarkar/Google-T5-base-Grammatical_Error_Correction-Finetuned-C4-200M-550k'

# Run on GPU when available, otherwise fall back to CPU.
torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the tokenizer and the fine-tuned T5 model from the Hugging Face Hub.
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name).to(torch_device)
```
- Running inference with the model
```python
def correct_grammar(input_text, num_return_sequences):
    # Tokenize the input sentence and move it to the same device as the model.
    batch = tokenizer([input_text], truncation=True, padding='max_length',
                      max_length=64, return_tensors="pt").to(torch_device)
    # Generate corrections with beam search, returning the top beams.
    translated = model.generate(**batch, max_length=64, num_beams=4,
                                num_return_sequences=num_return_sequences)
    # Decode the generated token ids back into text.
    tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)
    return tgt_text

text = 'He are moving here.'
print(correct_grammar(text, num_return_sequences=2))
```
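
With `num_beams=4` the call above performs beam search, so `num_return_sequences=2` returns the two highest-scoring beams (which often differ only slightly); `temperature` only affects generation when sampling is enabled via `do_sample=True`.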
## Training Details
### Training Data
The model was fine-tuned on [abhinavsarkar/C4-200m-550k-Determiner](https://huggingface.co/datasets/abhinavsarkar/C4-200m-550k-Determiner), a 550k-example subset of the [C4-200M dataset](https://www.kaggle.com/datasets/felixstahlberg/the-c4-200m-dataset-for-gec) for grammatical error correction (GEC). The full C4-200M corpus contains roughly 200 million synthetic error-correction pairs with diverse syntactic and semantic structures; this subset, as its name suggests, focuses on determiner errors.
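
To inspect the data before training, it can be loaded with the `datasets` library. A minimal sketch; the split name and column layout are assumptions to be checked against the dataset card:
```python
from datasets import load_dataset

# Download the fine-tuning subset from the Hugging Face Hub.
ds = load_dataset("abhinavsarkar/C4-200m-550k-Determiner")

# Print the available splits and columns, then a sample record.
print(ds)
print(ds["train"][0])  # assumes a "train" split exists
```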
### Training Procedure
The model was fine-tuned using the Hugging Face Transformers library, with Weights & Biases (wandb) for experiment tracking, on Google Colab.
#### Training Hyperparameters
- **Training regime:** fp16 mixed precision
- **Epochs:** 2
- **Batch size:** 16
- **Learning rate:** 2e-4
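
The original training script is not included in this card, but the hyperparameters above map onto Hugging Face `Seq2SeqTrainingArguments` roughly as in the sketch below (the output directory is an illustrative placeholder; data preprocessing and the `Seq2SeqTrainer` setup are omitted):
```python
from transformers import Seq2SeqTrainingArguments

# Hyperparameters from this card expressed as Seq2SeqTrainingArguments.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-gec",        # placeholder path
    num_train_epochs=2,              # Epochs: 2
    per_device_train_batch_size=16,  # Batch size: 16
    learning_rate=2e-4,              # Learning rate: 2e-4
    fp16=True,                       # fp16 mixed precision
    report_to="wandb",               # experiment tracking with wandb
)
```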
## Technical Specifications
### Compute Infrastructure
#### Hardware
The fine-tuning was conducted on a single NVIDIA T4 GPU.
#### Software
- **Framework**: PyTorch
- **Libraries**: Hugging Face Transformers
## More Information
For further details or inquiries, please reach out via [LinkedIn](https://www.linkedin.com/in/abhinavsarkarrr/) or email at [email protected].
## Model Card Authors
- Abhinav Sarkar
## Model Card Contact
- [email protected]