Spaces: ipipan/nlprepl
martynawck committed on Oct 6, 2023

Commit 01a7a32 • 1 Parent(s): d1096b6

Update index.html

Files changed (1)

index.html CHANGED

@@ -13,7 +13,7 @@
   <hr>
   <h2 style="text-align: center;">NLPre-PL Dataset</h2>
     <hr>
-  <p>The official NLPre-PL dataset - a uniformly paragraph-level divided version of NKJP1M corpus – the 1-million token balanced subcorpus of the National Corpus of Polish (Narodowy Korpus Jezyka Polskiego).
+  <p>The official NLPre-PL dataset - a uniformly paragraph-level divided version of NKJP1M corpus – the 1 million token balanced subcorpus of the National Corpus of Polish (Narodowy Korpus Jezyka Polskiego).
   </p>
   <p>
     The NLPre dataset aims at fairly dividing the paragraphs length-wise and topic-wise into train, development, and test sets. Thus, we ensure a similar number of segments distribution per paragraph and avoid the situation when paragraphs with a small (or large) number of segments are available only e.g. during test time.