---
language:
- cs
tags:
- abstractive summarization
- mbart-cc25
- Czech
license: apache-2.0
datasets:
- SumeCzech dataset news-based
metrics:
- rouge
- rougeraw
---

# mBART fine-tuned model for Czech abstractive summarization (HT2A-S)
This model is a fine-tuned checkpoint of [facebook/mbart-large-cc25](https://huggingface.co/facebook/mbart-large-cc25) on the Czech news dataset to produce Czech abstractive summaries.

## Task
The model addresses the ``Headline + Text to Abstract`` (HT2A) task: generating a multi-sentence, abstract-style summary from the headline and full text of a Czech news article.
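
The exact input formatting used during training is defined by the authors' preprocessing and is not documented on this page; purely as an illustration of the task's input/output shape (the concatenation below is an assumption, not the confirmed training format):

```python
# Hypothetical HT2A example. Input: headline plus the article's full text;
# target: a multi-sentence abstract. The concatenation scheme is assumed.
headline = "Vláda schválila nový rozpočet"  # "The government approved a new budget"
full_text = "Vláda dnes na svém zasedání schválila návrh státního rozpočtu ..."
ht2a_input = headline + " " + full_text  # for illustration only
# desired output: a few Czech sentences summarizing the article (its abstract)
```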

## Dataset
The model was trained on the [SumeCzech](https://ufal.mff.cuni.cz/sumeczech) dataset, which contains around 1M Czech news documents, each consisting of a headline, an abstract, and the full text. Inputs were truncated and padded to 512 tokens.
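
As a minimal sketch, the 512-token setting above could be applied with the Transformers tokenizer as follows (the authors' actual preprocessing pipeline may differ):

```python
from transformers import AutoTokenizer

# Illustrative preprocessing: truncate/pad each document to 512 tokens,
# matching the setting described above.
tokenizer = AutoTokenizer.from_pretrained("krotima1/mbart-ht2a-s")
batch = tokenizer(
    ["Titulek. Plný text článku ..."],  # headline + full text of one document
    max_length=512,
    truncation=True,
    padding="max_length",
    return_tensors="pt",
)
print(batch["input_ids"].shape)  # (1, 512)
```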

## Training
The model was trained on 1x NVIDIA Tesla A100 40GB for 20 hours, 1x NVIDIA Tesla V100 32GB for 40 hours, and 4x NVIDIA Tesla A100 40GB for 20 hours. During training, the model saw 6928K documents, corresponding to roughly 8 epochs.

## Use
The snippet below assumes you are using the provided Summarizer.ipynb notebook, which supplies the `Summarizer` class used at the end.
```python
from collections import OrderedDict

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

def summ_config():
    cfg = OrderedDict([
        # summarization model - checkpoint from website
        ("model_name", "krotima1/mbart-ht2a-s"),
        # generation settings passed to model.generate
        ("inference_cfg", OrderedDict([
            ("num_beams", 4),
            ("top_k", 40),
            ("top_p", 0.92),
            ("do_sample", True),
            ("temperature", 0.89),
            ("repetition_penalty", 1.2),
            ("no_repeat_ngram_size", None),
            ("early_stopping", True),
            ("max_length", 96),
            ("min_length", 10),
        ])),
        # texts to summarize
        ("text",
            [
                "Input your Czech text",
            ]
        ),
    ])
    return cfg

cfg = summ_config()
# load model and tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained(cfg["model_name"])
tokenizer = AutoTokenizer.from_pretrained(cfg["model_name"])
# init summarizer (the Summarizer class is defined in Summarizer.ipynb)
summarize = Summarizer(model, tokenizer, cfg["inference_cfg"])
summarize(cfg["text"])
```
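
`Summarizer` itself is defined in the notebook and is not reproduced on this page. As a rough sketch of what such a wrapper might look like, assuming it simply forwards the inference settings to `model.generate` (an illustration, not the notebook's actual implementation):

```python
class Summarizer:
    """Illustrative stand-in for the Summarizer class from Summarizer.ipynb."""

    def __init__(self, model, tokenizer, inference_cfg):
        self.model = model
        self.tokenizer = tokenizer
        # Drop settings left as None so model.generate falls back to its defaults.
        self.gen_kwargs = {k: v for k, v in inference_cfg.items() if v is not None}

    def __call__(self, texts):
        summaries = []
        for text in texts:
            inputs = self.tokenizer(
                text, max_length=512, truncation=True, return_tensors="pt"
            )
            output_ids = self.model.generate(**inputs, **self.gen_kwargs)
            summaries.append(
                self.tokenizer.decode(output_ids[0], skip_special_tokens=True)
            )
        return summaries
```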