howanching-clara committed on
Commit a0c74e3
1 Parent(s): 14af5bf

Update README.md

Files changed (1)
  1. README.md +1 -2
README.md CHANGED
@@ -31,8 +31,7 @@ It achieves the following results on the evaluation set:
  ## Model description
 
  The model is fine-tuned with academic publications in Linguistics, to classify texts in publications into 4 classes as a filter to other tasks.
-
- The 4 classes:
+ Sentence-based data obtained from OCR-processed PDF files was annotated manually with the following classes:
  - 0: out of scope - materials that are of low significance, e.g. page number and page header, noise from OCR/pdf-to-text conversion
  - 1: main text - texts that are the main texts of the publication, to be used for down-stream tasks
  - 2: examples - texts that are captions of the figures, or quotes or excerpts