flan-t5-base-tldr_news
A fine-tuned T5 model for text summarization and title generation on TLDR (Too Long; Didn't Read) news articles.
Introduction
flan-t5-base-tldr_news is a T5-based model fine-tuned on a dataset of TLDR news articles to perform two tasks: text summarization and title generation.
T5 is a transformer-based architecture that has achieved state-of-the-art results on a variety of NLP tasks. By fine-tuning it on TLDR news articles, we aim to produce a model that generates concise, informative summaries and titles for news articles.
Task
The main goal of this model is to perform two NLP tasks: text summarization and title generation. Text summarization involves generating a shortened version of a longer text that retains the most important information and ideas. Title generation, on the other hand, involves generating a headline or title for a given text that accurately and concisely captures the main theme or idea of the text.
Architecture
flan-t5-base-tldr_news uses the T5 architecture, which has been shown to be effective for a variety of NLP tasks. T5 is an encoder-decoder model: the encoder reads the input text, and the decoder generates the summary or title one token at a time.
Model Size
The model has 247,577,856 parameters (tunable weights). Model size affects speed and memory requirements during training and inference, and it can also affect performance on specific tasks.
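As a rough illustration of how the parameter count translates into memory, the sketch below estimates the size of the weights alone at a few common numeric precisions. The function name and the precision table are illustrative assumptions, not part of the model's tooling, and the estimate ignores activations, optimizer state, and framework overhead.

```python
# Parameter count as stated in this model card.
NUM_PARAMETERS = 247_577_856

# Bytes needed to store one weight at each precision (illustrative selection).
BYTES_PER_PARAMETER = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_bytes(num_parameters, dtype="float32"):
    """Return the bytes needed to hold the weights alone at the given precision."""
    return num_parameters * BYTES_PER_PARAMETER[dtype]


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAMETER:
        size_mib = weight_memory_bytes(NUM_PARAMETERS, dtype) / 2**20
        print(f"{dtype}: {size_mib:.1f} MiB")
```

Actual memory use during inference is higher than these figures, and training typically needs several times more because of gradients and optimizer state.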
Training Data
The model was fine-tuned on a dataset of TLDR news articles. This dataset was selected because it contains a large number of news articles that have been condensed into short summaries, making it a good choice for training a text summarization model. The training data went through standard preprocessing steps, including tokenization, to prepare it for input into the model.
Evaluation Metrics
To evaluate the performance of the model on the tasks of text summarization and title generation, we used the ROUGE metric. ROUGE, or Recall-Oriented Understudy for Gisting Evaluation, measures the n-gram overlap between the generated text and a reference text, which in this case is the reference summary or title for the article. ROUGE is commonly used in NLP evaluations of summarization and gives a useful, if imperfect, measure of the quality of the generated summaries and titles.
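To make the overlap idea concrete, here is a minimal sketch of ROUGE-N and ROUGE-L F1 scores in plain Python. It assumes lowercased whitespace tokenization and no stemming, so it is a simplification and will not reproduce the numbers from a standard ROUGE implementation exactly; the function names are illustrative. It returns values between 0 and 1, whereas scores are often reported multiplied by 100.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(matches, candidate_total, reference_total):
    """Combine precision and recall into an F1 score."""
    if matches == 0:
        return 0.0
    precision = matches / candidate_total
    recall = matches / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = _ngrams(candidate.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    return _f1(overlap, sum(cand.values()), sum(ref.values()))


def rouge_l(candidate, reference):
    """ROUGE-L F1: based on the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    prev = [0] * (len(ref) + 1)  # LCS lengths for the previous candidate token
    for cand_tok in cand:
        curr = [0]
        for j, ref_tok in enumerate(ref, start=1):
            if cand_tok == ref_tok:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return _f1(prev[-1], len(cand), len(ref))


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print(rouge_n(generated, reference, n=1))
    print(rouge_n(generated, reference, n=2))
    print(rouge_l(generated, reference))
```

For reported results, an established ROUGE implementation is preferable, since tokenization and stemming choices change the scores.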
The following table shows the ROUGE scores for the model on the test set, which provides a good indication of its overall performance on the text summarization and title generation tasks:
Metric | Score |
---|---|
Rouge1 | 45.04 |
Rouge2 | 25.24 |
RougeL | 41.89 |
RougeLsum | 41.84 |
It's important to note that these scores are just a snapshot of the model's performance on a specific test set, and the performance of the model may vary depending on the input text, the quality of the training data, and the specific application for which the model is being used.
How to use via API
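from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub.
summarizer = pipeline(
    'summarization',
    model='ybagoury/flan-t5-base-tldr_news',
)

raw_text = """ your text here... """

# The pipeline returns a list with one dict per input text.
results = summarizer(raw_text)
print(results[0]['summary_text'])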