Model Card for t5-small Summarization Model
Model Details
- Model Architecture: T5 (Text-to-Text Transfer Transformer)
- Variant: t5-small
- Task: Text Summarization
- Framework: Hugging Face Transformers
Training Data
- Dataset: CNN/DailyMail
- Content: News articles and their summaries
- Size: Approximately 300,000 article-summary pairs (see the loading snippet below)
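As an illustration, the dataset can be inspected with the Hugging Face `datasets` library; the `3.0.0` configuration is an assumption, since the card does not state which version was used:

```python
# Sketch: inspecting CNN/DailyMail with the datasets library.
# Requires: pip install datasets
from datasets import load_dataset

dataset = load_dataset("cnn_dailymail", "3.0.0")  # dataset version assumed
print(dataset)  # train (~287k pairs), validation, and test splits
example = dataset["train"][0]
print(example["article"][:200])   # a news article
print(example["highlights"])      # its reference summary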
Training Procedure
- Fine-tuning method: supervised fine-tuning with the Hugging Face Transformers library (a sketch follows the hyperparameter list below)
- Hyperparameters:
  - Learning rate: 5e-5
  - Batch size: 8
  - Number of epochs: 3
  - Optimizer: AdamW
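For reference, a minimal fine-tuning sketch using the hyperparameters above; the preprocessing details, truncation lengths for targets, and output directory are assumptions, not the original training script:

```python
# Sketch: fine-tuning t5-small on CNN/DailyMail with the listed hyperparameters.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
dataset = load_dataset("cnn_dailymail", "3.0.0")  # dataset version assumed

def preprocess(batch):
    # T5 expects a task prefix; articles are truncated to the 512-token limit.
    inputs = tokenizer(
        ["summarize: " + a for a in batch["article"]],
        max_length=512, truncation=True,
    )
    labels = tokenizer(text_target=batch["highlights"], max_length=150, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True,
                        remove_columns=dataset["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="t5-small-cnn-dailymail",  # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    optim="adamw_torch",  # AdamW, as listed above
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```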
How to Use
- Install the Hugging Face Transformers library:

```bash
pip install transformers
```

- Load the tokenizer and model:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# "t5-small" is the base checkpoint; substitute the fine-tuned
# checkpoint path here if one is published.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
```

- Generate a summary (T5 expects the "summarize: " task prefix):

```python
input_text = "Your input text here"
inputs = tokenizer("summarize: " + input_text, return_tensors="pt", max_length=512, truncation=True)
summary_ids = model.generate(inputs["input_ids"], max_length=150, min_length=40, length_penalty=2.0, num_beams=4, early_stopping=True)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```
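Alternatively, the `pipeline` API wraps tokenization and generation in a single call; `"t5-small"` below is the base checkpoint named above, so substitute a fine-tuned checkpoint path where available:

```python
# Sketch: one-call summarization via the pipeline API.
from transformers import pipeline

summarizer = pipeline("summarization", model="t5-small")  # or a fine-tuned checkpoint
result = summarizer("Your input text here", max_length=150, min_length=40)
print(result[0]["summary_text"])
```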
Evaluation
- Metric: ROUGE scores (Recall-Oriented Understudy for Gisting Evaluation)
- Exact scores are not available; summarization models on CNN/DailyMail are typically evaluated with (a scoring sketch follows this list):
  - ROUGE-1 (unigram overlap)
  - ROUGE-2 (bigram overlap)
  - ROUGE-L (longest common subsequence)
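As an illustration, ROUGE can be computed with the Hugging Face `evaluate` library; the toy predictions and references below are placeholders, not results from this model:

```python
# Sketch: scoring generated summaries against references with ROUGE.
# Requires: pip install evaluate rouge_score
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the cat sat on the mat"]       # model outputs (toy example)
references = ["a cat was sitting on the mat"]  # gold summaries (toy example)
print(rouge.compute(predictions=predictions, references=references))
# -> {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```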
Limitations
- Performance is generally lower than that of larger T5 variants
- Optimized for news-article summarization; may not perform as well on other text types
- Input is limited to 512 tokens; longer sequences are truncated
- Generated summaries may contain factual inaccuracies
Ethical Considerations
- May inherit biases present in the CNN/DailyMail dataset
- Not suitable for summarizing sensitive or critical information without human review
- Users should be aware of potential biases and inaccuracies in generated summaries
- Should not be used as a sole source of information for decision-making processes
Model Tree
- Model: privetin/model-1
- Base model: google-t5/t5-small