qanastek committed on
Commit
9f3970e
1 Parent(s): 6a13a62
Files changed (2)
  1. README.md +6 -13
  2. preview.PNG +0 -0
README.md CHANGED
@@ -48,18 +48,11 @@ print(sentence.to_tagged_string())
 
 Output:
 
-```bash
-George <PROPN>
-Washington <XFAMIL>
-est <AUX>
-allé <VPPMS>
-à <PREP>
-Washington <PROPN>
-```
+![Preview Output](preview.PNG)
 
-## Corpora
+## Training data
 
-`UD_FRENCH_GSD_PLUS` is a part-of-speech tagging corpora based on [UD_French-GSD](https://universaldependencies.org/treebanks/fr_gsd/index.html) which was originally created in 2015 and is based on the [universal dependency treebank v2.0](https://github.com/ryanmcd/uni-dep-tb).
+`UD_FRENCH_GSD_Plus` is a part-of-speech tagging corpora based on [UD_French-GSD](https://universaldependencies.org/treebanks/fr_gsd/index.html) which was originally created in 2015 and is based on the [universal dependency treebank v2.0](https://github.com/ryanmcd/uni-dep-tb).
 
 Originally, the corpora consists of 400,399 words (16,341 sentences) and had 17 different classes. Now, after applying our tags augmentation we obtain 60 different classes which add semantic information such as the gender, number, mood, person, tense or verb form given in the different CoNLL-03 fields from the original corpora.
 
@@ -73,7 +66,7 @@ PRON VERB SCONJ ADP CCONJ DET NOUN ADJ AUX ADV PUNCT PROPN NUM SYM PART X INTJ
 
 ## New Tags
 
-| Tag | Full Name | Examples |
+| Abbreviation | Description | Examples |
 |:--------:|:--------:|:--------:|
 | PREP | Preposition | de |
 | AUX | Auxiliary Verb | est |
@@ -139,7 +132,7 @@ PRON VERB SCONJ ADP CCONJ DET NOUN ADJ AUX ADV PUNCT PROPN NUM SYM PART X INTJ
 | MOTINC | Unknown words | Technology Lady |
 | X | Typos & others | sfeir 3D statu |
 
-## Evaluation Metrics
+## Evaluation results
 
 ```plain
 Results:
@@ -215,7 +208,7 @@ By class:
 weighted avg 0.9524 0.9520 0.9515 10019
 ```
 
-## Citations
+## BibTeX Citations
 
 Please cite the following paper when using this model.
 
preview.PNG ADDED
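
Note on the replaced sample output: the `bash` block this commit removes listed one `token <TAG>` pair per line, which the image can no longer convey as text. As a minimal sketch, assuming that one-pair-per-line layout, the snippet below parses such output into `(token, tag)` tuples. The helper name `parse_tagged_lines` and the regular expression are illustrative assumptions and are not part of the README or of Flair.

```python
import re
from typing import List, Tuple

# Assumed layout: one "token <TAG>" pair per line, as in the removed sample.
PAIR_RE = re.compile(r"^(?P<token>.+?)\s+<(?P<tag>[^<>\s]+)>$")


def parse_tagged_lines(text: str) -> List[Tuple[str, str]]:
    """Hypothetical helper: turn 'token <TAG>' lines into (token, tag) pairs."""
    pairs: List[Tuple[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = PAIR_RE.match(line)
        if match is None:
            raise ValueError(f"unexpected line: {line!r}")
        pairs.append((match.group("token"), match.group("tag")))
    return pairs


# The sample lines removed from README.md by this commit.
sample = """George <PROPN>
Washington <XFAMIL>
est <AUX>
allé <VPPMS>
à <PREP>
Washington <PROPN>"""

pairs = parse_tagged_lines(sample)
tags = [tag for _, tag in pairs]
print(pairs)
```

In this sample the two occurrences of "Washington" carry different tags (`XFAMIL` and `PROPN`), so the pairs are kept in order rather than collapsed into a dictionary keyed by token.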