---
language:
- cs
tags:
- abstractive summarization
- mbart-cc25
- Czech
license: apache-2.0
datasets:
- SumeCzech dataset news-based
metrics:
- rouge
- rougeraw
---
# mBART fine-tuned model for Czech abstractive summarization (AT2H-S)
This model is a fine-tuned checkpoint of [facebook/mbart-large-cc25](https://huggingface.co/facebook/mbart-large-cc25) on the Czech news dataset to produce Czech abstractive summaries.
## Task
The model addresses the ``Abstract + Text to Headline`` (AT2H) task, which consists of generating a one- or two-sentence summary, treated as a headline, from a Czech news text.
## Dataset
The model was trained on the [SumeCzech](https://ufal.mff.cuni.cz/sumeczech) dataset, which contains around 1M Czech news documents, each consisting of Headline, Abstract, and Full-text sections. Truncation and padding were set to 512 tokens.
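As an illustration of what a fixed 512-token length means, here is a minimal sketch in plain Python. The helper `pad_or_truncate` and the padding id `PAD_ID = 1` are assumptions made for this example, not taken from the training code. With a Hugging Face tokenizer the same effect is normally requested through `max_length=512, truncation=True, padding="max_length"`, and the real tokenizer also keeps its special tokens in place when truncating, which this sketch does not model.

```python
from typing import List

MAX_LENGTH = 512  # length stated in this model card
PAD_ID = 1        # assumption: placeholder id, check tokenizer.pad_token_id


def pad_or_truncate(token_ids: List[int],
                    max_length: int = MAX_LENGTH,
                    pad_id: int = PAD_ID) -> List[int]:
    """Return exactly max_length ids: cut long inputs, right-pad short ones."""
    clipped = token_ids[:max_length]
    return clipped + [pad_id] * (max_length - len(clipped))


# A short input is padded on the right, a long one is cut at 512 ids.
short = pad_or_truncate([5, 6, 7])
long = pad_or_truncate(list(range(600)))
print(len(short), len(long))
```

In practice this means any article longer than 512 tokens is only partially visible to the model.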
## Training
The model was trained on a single NVIDIA Tesla A100 40GB GPU for 40 hours. During training, the model saw 2576K documents, corresponding to roughly 3 epochs.
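The figures above imply a per-epoch document count and an average throughput. The short calculation below derives both from the numbers in this section only. It is back-of-the-envelope arithmetic that treats "roughly 3 epochs" as exactly 3, so the results are approximate.

```python
docs_seen = 2_576_000   # "2576K documents" from this section
epochs = 3              # "roughly 3 epochs", treated as exact here
hours = 40              # reported training time

docs_per_epoch = docs_seen / epochs
docs_per_second = docs_seen / (hours * 3600)

print(f"documents per epoch: {docs_per_epoch:,.0f}")
print(f"documents per second: {docs_per_second:.1f}")
```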
## Use
The example below assumes you are working in the provided Summarizer.ipynb notebook.
```python
from collections import OrderedDict

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Summarizer is expected to come from the provided Summarizer.ipynb notebook.


def summ_config():
    cfg = OrderedDict([
        # summarization model - checkpoint from website
        ("model_name", "krotima1/mbart-at2h-s"),
        ("inference_cfg", OrderedDict([
            ("num_beams", 4),
            ("top_k", 40),
            ("top_p", 0.92),
            ("do_sample", True),
            ("temperature", 0.89),
            ("repetition_penalty", 1.2),
            ("no_repeat_ngram_size", None),
            ("early_stopping", True),
            ("max_length", 96),
            ("min_length", 10),
        ])),
        # texts to summarize
        ("text",
            [
                "Input your Czech text",
            ]
        ),
    ])
    return cfg


cfg = summ_config()
# load model
model = AutoModelForSeq2SeqLM.from_pretrained(cfg["model_name"])
tokenizer = AutoTokenizer.from_pretrained(cfg["model_name"])
# init summarizer
summarize = Summarizer(model, tokenizer, cfg["inference_cfg"])
summarize(cfg["text"])
```
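If the notebook is not at hand, the same checkpoint can probably be driven directly through the `transformers` generation API. The sketch below is an assumption about how that could look, not the notebook's actual implementation: the function names `clean_generation_kwargs` and `summarize` are hypothetical, any pre- or post-processing done by `Summarizer` is not reproduced, and mBART language-code handling is left at the checkpoint's defaults.

```python
from typing import Any, Dict, List


def clean_generation_kwargs(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Drop entries set to None so generate() falls back to its defaults."""
    return {key: value for key, value in cfg.items() if value is not None}


def summarize(texts: List[str],
              inference_cfg: Dict[str, Any],
              model_name: str = "krotima1/mbart-at2h-s") -> List[str]:
    """Hypothetical stand-in for the notebook's Summarizer."""
    # Imported here so the helper above works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    # Truncate to the 512-token limit stated in the Dataset section.
    inputs = tokenizer(texts, max_length=512, truncation=True,
                       padding=True, return_tensors="pt")
    output_ids = model.generate(**inputs,
                                **clean_generation_kwargs(inference_cfg))
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

With the configuration from the example above, a call would look like `summarize(cfg["text"], cfg["inference_cfg"])`. Because `do_sample` is enabled, repeated runs may return different headlines.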