# Model description
- Morphosyntactic analyzer: Trankit
- Tagset: UD
- Embedding vectors: XLM-RoBERTa-Base
- Dataset: NLPrePL-NKJP-fair-by-name (https://huggingface.co/datasets/ipipan/nlprepl)
# How to use
## Clone
```
git clone [email protected]:ipipan/nlpre_trankit_ud_xlm-roberta-base_nkjp-by-name
```
## Load model
```
import trankit
model_path = './nlpre_trankit_ud_xlm-roberta-base_nkjp-by-name'
# Check that the cloned directory contains everything a customized pipeline needs.
trankit.verify_customized_pipeline(
    category='customized-mwt',  # pipeline category
    save_dir=model_path,  # directory holding the saved model files (the cloned repository)
    embedding_name='xlm-roberta-base'  # embedding the pipeline was trained with (Trankit's default)
)
# Load the verified pipeline from the same directory.
model = trankit.Pipeline(lang='customized-mwt', cache_dir=model_path, embedding='xlm-roberta-base')
```
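Once the pipeline is loaded, Trankit's `posdep` method can tag raw text and returns a nested dictionary. The sketch below shows one way to flatten that result into `(form, UPOS)` pairs. The helper `iter_tagged_words` and the `sample_doc` dictionary are hypothetical: the sample is written by hand to mimic the shape of Trankit's output so the snippet runs without the model, and it is not actual output of this model.

```python
def iter_tagged_words(doc):
    """Yield (form, upos) pairs from a Trankit-style document dict."""
    for sentence in doc.get('sentences', []):
        for token in sentence.get('tokens', []):
            # Assumption: multiword tokens list their syntactic words under
            # 'expanded'; ordinary tokens carry the tags themselves.
            for word in token.get('expanded', [token]):
                yield word['text'], word.get('upos')


# With the pipeline loaded as in the block above, a real call would be:
# doc = model.posdep('Ala ma kota.')

# Hand-written stand-in for the result (illustrative only, not model output).
sample_doc = {
    'text': 'Ala ma kota.',
    'sentences': [{
        'id': 1,
        'text': 'Ala ma kota.',
        'tokens': [
            {'id': 1, 'text': 'Ala', 'upos': 'PROPN'},
            {'id': 2, 'text': 'ma', 'upos': 'VERB'},
            {'id': 3, 'text': 'kota', 'upos': 'NOUN'},
            {'id': 4, 'text': '.', 'upos': 'PUNCT'},
        ],
    }],
}

tagged = list(iter_tagged_words(sample_doc))
for form, upos in tagged:
    print(f'{form}\t{upos}')
```

Replacing `sample_doc` with the dictionary returned by `model.posdep(...)` should give the same kind of listing for real input, provided the output layout matches the assumption noted in the comments.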