Fine-Tuned T5-Small for Article & News Summarization

Description

This model is a fine-tuned version of T5-small for article and news summarization. It was trained on the CNN/DailyMail dataset to generate concise summaries of news articles.

How to Use

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned checkpoint from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("d0p3/t5-small-dailycnn")
model = AutoModelForSeq2SeqLM.from_pretrained("d0p3/t5-small-dailycnn")

text = """
(Your long article text to summarize goes here.)
"""

# T5 is a text-to-text model, so the "summarize: " prefix selects the task.
# Inputs longer than 512 tokens are truncated.
inputs = tokenizer("summarize: " + text, return_tensors="pt", max_length=512, truncation=True)

# Beam search with 4 beams; summaries are capped at 128 tokens.
# Passing **inputs forwards the attention mask along with the input IDs.
summary_ids = model.generate(**inputs, num_beams=4, max_length=128)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

print(summary)
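
For quick experiments, the high-level pipeline API wraps the same tokenize/generate/decode steps. A minimal sketch: whether the "summarize: " prefix is added automatically depends on the task_specific_params saved in this checkpoint's config, so prepend it manually if results look off.

from transformers import pipeline

summarizer = pipeline("summarization", model="d0p3/t5-small-dailycnn")
# num_beams and max_length are forwarded to model.generate();
# truncation=True clips articles that exceed the encoder's input limit.
result = summarizer(text, num_beams=4, max_length=128, truncation=True)
print(result[0]["summary_text"])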

Training Details

  • Dataset: CNN/DailyMail (version 3.0.0)
  • Base Model: T5-small
  • Learning Rate: 2e-5
  • Batch Size: 4
  • Epochs: 3
  • Optimizer: AdamW with weight decay 0.01
  • Hardware: 1 × RTX 4090
  • Framework: PyTorch
  • Model Size: 60.5M parameters (FP32, Safetensors)
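
The exact training script is not part of this repository; the sketch below shows how a run with the hyperparameters above might look using the standard Hugging Face Seq2SeqTrainer. Column names follow the cnn_dailymail dataset ("article", "highlights"); treat this as an illustration, not the original code.

from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
dataset = load_dataset("cnn_dailymail", "3.0.0")

def preprocess(batch):
    # Prefix the articles so T5 knows which task to perform.
    inputs = tokenizer(
        ["summarize: " + a for a in batch["article"]],
        max_length=512, truncation=True,
    )
    labels = tokenizer(text_target=batch["highlights"], max_length=128, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="t5-small-dailycnn",
    learning_rate=2e-5,             # matches the value listed above
    per_device_train_batch_size=4,
    num_train_epochs=3,
    weight_decay=0.01,              # Trainer's default optimizer is AdamW
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()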

Limitations

  • This model may not perform well on article styles that differ significantly from those in the CNN/DailyMail dataset; the sketch after this list shows one way to check fit on your own data.
  • As with many language models, it may reproduce biases or inaccuracies present in the training data.
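
To gauge how well the model transfers to a different domain, scoring a handful of examples with ROUGE is a quick sanity check. A sketch using the evaluate library (requires pip install evaluate rouge_score); the articles and reference summaries are placeholders for your own data.

import evaluate
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("d0p3/t5-small-dailycnn")
model = AutoModelForSeq2SeqLM.from_pretrained("d0p3/t5-small-dailycnn")

def summarize(article):
    inputs = tokenizer("summarize: " + article, return_tensors="pt",
                       max_length=512, truncation=True)
    ids = model.generate(**inputs, num_beams=4, max_length=128)
    return tokenizer.decode(ids[0], skip_special_tokens=True)

# Replace with articles and human-written summaries from your own domain.
articles = ["(domain-specific article text)"]
references = ["(reference summary for the article)"]

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=[summarize(a) for a in articles],
                       references=references)
print(scores)  # rouge1 / rouge2 / rougeL F-measures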

Ethical Considerations

Please use this model responsibly. Consider how the generated summaries may inadvertently perpetuate harmful stereotypes or misinformation.

Contact

Feel free to leave feedback or report issues on this Hugging Face repository.

