martynawck committed
Commit 1285fcd
1 Parent(s): 7f9c385

Create README.md

  1. README.md +30 -0
README.md ADDED
@@ -0,0 +1,30 @@
+ # Model description
+
+ - Morphosyntactic analyzer: Trankit
+ - Tagset: UD
+ - Embedding vectors: XLM-RoBERTa-Large
+ - Dataset: NLPrePL-NKJP-fair-by-name (https://huggingface.co/datasets/ipipan/nlprepl)
+
+ # How to use
+
+ ## Clone
+
+ ```
+ git clone git@hf.co:ipipan/nlpre_trankit_ud_xlm-roberta-large_nkjp-by-name
+ ```
+
+ ## Load model
+
+ ```
+ import trankit
+
+ model_path = './nlpre_trankit_ud_xlm-roberta-large_nkjp-by-name'
+
+ trankit.verify_customized_pipeline(
+     category='customized-mwt',  # pipeline category
+     save_dir=model_path,  # directory with the cloned model files
+     embedding_name='xlm-roberta-large'  # embedding used to train this pipeline (Trankit's default is `xlm-roberta-base`)
+ )
+
+ model = trankit.Pipeline(lang='customized-mwt', cache_dir=model_path, embedding='xlm-roberta-large')
+ ```
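
The README stops once the pipeline is loaded. As a rough sketch of a possible next step, the helper below collects the form, UPOS tag, and morphological features of each syntactic word. It is not part of the commit. The name `tagged_words` is hypothetical, and the sketch assumes the loaded pipeline exposes Trankit's usual `posdep` method and returns its documented dictionary layout: `sentences` containing `tokens`, with multi-word tokens carrying their words under `expanded`.

```python
from typing import Any, Dict, List, Tuple


def tagged_words(pipeline: Any, text: str) -> List[Tuple[str, str, str]]:
    """Return (form, UPOS, FEATS) for every syntactic word in `text`.

    `pipeline` is assumed to behave like a loaded `trankit.Pipeline`:
    its `posdep` method takes raw text and returns a nested dictionary.
    """
    doc: Dict[str, Any] = pipeline.posdep(text)
    rows: List[Tuple[str, str, str]] = []
    for sentence in doc["sentences"]:
        for token in sentence["tokens"]:
            # A multi-word token lists its syntactic words under "expanded";
            # an ordinary token is treated as a single word.
            for word in token.get("expanded", [token]):
                rows.append(
                    (word["text"], word.get("upos", "_"), word.get("feats", "_"))
                )
    return rows
```

With `model` loaded as in the README, `tagged_words(model, 'Ala ma kota.')` would return one tuple per word. Missing tags are reported as `_`, following the CoNLL-U convention for empty fields.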