Marian Krotil
committed on
Commit: 15d3c4e
Parent(s): b4d41f3
Update README.md
README.md
CHANGED
@@ -21,7 +21,7 @@ This model is a fine-tuned checkpoint of [facebook/mbart-large-cc25](https://hug
 The model deals with the task ``Abstract + Text to Headline`` (AT2H), which consists of generating a one- or two-sentence summary, used as a headline, from a Czech news text.
 
 ## Dataset
-The model has been trained on a large Czech news dataset developed by concatenating two datasets: the private CNC dataset provided by Czech News Center and the [SumeCzech](https://ufal.mff.cuni.cz/sumeczech) dataset. The dataset includes around 1.75M Czech news documents, each consisting of Headline, Abstract, and Full-text sections. Truncation and padding were set to 512 tokens.
+The model has been trained on a large Czech news dataset developed by concatenating two datasets: the private CNC dataset provided by Czech News Center and the [SumeCzech](https://ufal.mff.cuni.cz/sumeczech) dataset. The dataset includes around 1.75M Czech news documents, each consisting of Headline, Abstract, and Full-text sections. Truncation and padding were set to 512 tokens for the encoder and 64 for the decoder.
 
 ## Training
 The model has been trained on 1x NVIDIA Tesla A100 40GB for 40 hours, 1x NVIDIA Tesla V100 32GB for 20 hours, and 4x NVIDIA Tesla A100 40GB for 20 hours. During training, the model saw 7936K documents, corresponding to roughly 5 epochs.
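The truncation-and-padding setting that this commit clarifies (512 tokens for encoder inputs, 64 for decoder targets) can be sketched in plain Python. This is an illustrative sketch only — the helper name `pad_or_truncate` and the pad id are assumptions, not the model's actual training code, which presumably used a Hugging Face tokenizer with `max_length`, `truncation`, and `padding` arguments:

```python
def pad_or_truncate(token_ids, max_len, pad_id=1):
    """Truncate a token-id sequence to max_len, or right-pad it with pad_id.

    pad_id=1 is an assumption (mBART's pad token id); the real training
    pipeline would take it from the tokenizer.
    """
    ids = token_ids[:max_len]
    return ids + [pad_id] * (max_len - len(ids))


# Per the updated README line: 512 tokens for the encoder, 64 for the decoder.
ENCODER_LEN, DECODER_LEN = 512, 64

# A long article body gets truncated; a short headline gets padded.
encoder_input = pad_or_truncate(list(range(700)), ENCODER_LEN)
decoder_target = pad_or_truncate(list(range(10)), DECODER_LEN)
```

Every batch element then has a fixed shape (512 for inputs, 64 for labels), which is what allows dense tensor batching during fine-tuning.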