This is a spaCy multilingual (Catalan and Spanish) anonymization model for use with BSC's AnonymizationPipeline:
https://github.com/TeMU-BSC/AnonymizationPipeline.
The anonymization pipeline is a library that identifies sensitive data in Spanish and Catalan user-generated plain text and then anonymizes the detected data.
This is not a standalone model and is meant to work within the pipeline.
The model can detect the following entities: EMAIL, FINANCIAL, ID, LOC, MISC, ORG, PER, TELEPHONE, VEHICLE, ZIP.
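To illustrate what anonymizing detected entities can look like, here is a minimal sketch in plain Python. It is not the AnonymizationPipeline's implementation: the `mask_entities` helper, the `(start_char, end_char, label)` span format, and the `<LABEL>` placeholder style are all assumptions made for this example. In practice the spans would come from the model's NER output rather than being written by hand.

```python
from typing import List, Tuple

# Hypothetical span format for this sketch: (start_char, end_char, label).
Span = Tuple[int, int, str]


def mask_entities(text: str, spans: List[Span]) -> str:
    """Replace each detected span with a <LABEL> placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Skip a span that overlaps one that was already masked.
            continue
        pieces.append(text[cursor:start])  # untouched text before the entity
        pieces.append(f"<{label}>")        # placeholder instead of the entity
        cursor = end
    pieces.append(text[cursor:])           # remaining text after the last entity
    return "".join(pieces)


# Hand-written example spans; labels are taken from the entity list above.
text = "Em dic Maria i visc a Girona."
spans = [(7, 12, "PER"), (22, 28, "LOC")]
masked = mask_entities(text, spans)
print(masked)
```

Replacing spans from left to right while tracking a cursor keeps the character offsets valid, since the original text is never modified in place.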
Feature | Description |
---|---|
Name | ca_anonimization_core_lg |
Version | 1.0.0 |
spaCy | >=3.2.3,<4.0.0 |
Default Pipeline | tok2vec , morphologizer , parser , attribute_ruler , lemmatizer , ner |
Components | tok2vec , morphologizer , parser , attribute_ruler , lemmatizer , ner |
Vectors | 500000 keys, 500000 unique vectors (300 dimensions) |
Sources | n/a |
License | MIT |
Author | Joaquin Silveira |
Label Scheme
View label scheme (322 labels for 3 components)
Component | Labels |
---|---|
morphologizer |
Definite=Def|Gender=Masc|Number=Sing|POS=DET|PronType=Art , POS=PROPN , POS=PUNCT|PunctSide=Ini|PunctType=Brck , POS=PUNCT|PunctSide=Fin|PunctType=Brck , Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin , Gender=Masc|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part , Definite=Def|Gender=Fem|Number=Sing|POS=DET|PronType=Art , Gender=Fem|Number=Sing|POS=NOUN , POS=ADP , NumType=Card|Number=Plur|POS=NUM , Gender=Masc|Number=Plur|POS=NOUN , Number=Sing|POS=ADJ , POS=CCONJ , Gender=Fem|Number=Sing|POS=DET|PronType=Ind , NumForm=Digit|NumType=Card|POS=NUM , NumForm=Digit|POS=NOUN , Gender=Masc|Number=Plur|POS=ADJ , POS=PUNCT|PunctType=Comm , POS=AUX|VerbForm=Inf , Case=Acc,Dat|POS=PRON|Person=3|PrepCase=Npr|PronType=Prs|Reflex=Yes , Definite=Def|Gender=Masc|Number=Plur|POS=DET|PronType=Art , POS=PRON|PronType=Rel , Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Imp|VerbForm=Fin , Gender=Fem|Number=Sing|POS=DET|PronType=Art , Gender=Fem|Number=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs , Definite=Def|Gender=Fem|Number=Plur|POS=DET|PronType=Art , Gender=Fem|Number=Plur|POS=NOUN , Gender=Fem|Number=Plur|POS=ADJ , POS=VERB|VerbForm=Inf , Case=Acc,Dat|Number=Plur|POS=PRON|Person=3|PronType=Prs , Number=Plur|POS=ADJ , POS=PUNCT|PunctType=Peri , Number=Sing|POS=PRON|PronType=Rel , Gender=Masc|Number=Sing|POS=NOUN , Mood=Imp|Number=Sing|POS=VERB|Person=2|VerbForm=Fin , Gender=Masc|Number=Plur|POS=ADJ|VerbForm=Part , POS=SCONJ , Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin , Gender=Masc|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part , Definite=Def|Number=Sing|POS=DET|PronType=Art , Gender=Masc|Number=Sing|POS=DET|PronType=Ind , Gender=Fem|Number=Plur|POS=ADJ|VerbForm=Part , Gender=Masc|Number=Sing|POS=DET|PronType=Dem , POS=VERB|VerbForm=Ger , POS=NOUN , Gender=Fem|NumType=Card|Number=Sing|POS=NUM , Gender=Fem|Number=Sing|POS=ADJ|VerbForm=Part , Gender=Fem|NumType=Ord|Number=Plur|POS=ADJ , POS=SYM , Gender=Masc|Number=Sing|POS=ADJ , 
Gender=Masc|Number=Sing|POS=ADJ|VerbForm=Part , Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin , Gender=Fem|Number=Sing|POS=DET|PronType=Dem , POS=ADV|Polarity=Neg , POS=ADV , Number=Sing|POS=PRON|PronType=Dem , Number=Sing|POS=NOUN , Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin , Number=Plur|POS=NOUN , Mood=Sub|Number=Plur|POS=VERB|Person=3|Tense=Imp|VerbForm=Fin , Gender=Fem|Number=Sing|POS=ADJ , Mood=Sub|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin , Gender=Masc|Number=Sing|POS=PRON|PronType=Tot , Case=Loc|POS=PRON|Person=3|PronType=Prs , Gender=Fem|NumType=Ord|Number=Sing|POS=ADJ , Degree=Cmp|POS=ADV , Gender=Fem|Number=Plur|POS=DET|PronType=Art , Gender=Fem|Number=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs , Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Fut|VerbForm=Fin , Gender=Masc|NumType=Ord|Number=Sing|POS=ADJ , Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Fut|VerbForm=Fin , NumType=Card|POS=NUM , Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Fut|VerbForm=Fin , Number=Sing|POS=PRON|PronType=Ind , Gender=Masc|Number=Sing|POS=DET|PronType=Art , Number=Plur|POS=DET|PronType=Ind , Mood=Sub|Number=Plur|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin , Gender=Masc|Number=Plur|POS=DET|PronType=Dem , Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Fut|VerbForm=Fin , Gender=Masc|NumType=Card|Number=Sing|POS=NUM , Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin , Case=Acc|Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs , Number=Sing|POS=DET|PronType=Ind , POS=PUNCT , Number=Sing|POS=DET|PronType=Rel , Case=Gen|POS=PRON|Person=3|PronType=Prs , Gender=Fem|NumType=Card|Number=Plur|POS=NUM , Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin , POS=DET|PronType=Ind , POS=AUX , Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=3|PronType=Prs , Case=Acc,Dat|Number=Plur|POS=PRON|Person=1|PronType=Prs , Degree=Cmp|Number=Sing|POS=ADJ , Number=Sing|POS=VERB , Gender=Masc|Number=Plur|POS=PRON|PronType=Ind , 
Gender=Fem|Number=Plur|POS=DET|PronType=Dem , Gender=Masc|Number=Plur|POS=DET|PronType=Art , Gender=Masc|Number=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs , Case=Acc|Gender=Fem,Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs , Gender=Fem|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part , Gender=Masc|Number=Sing|POS=PRON|PronType=Ind , Gender=Fem|Number=Plur|POS=PRON|PronType=Ind , Mood=Sub|Number=Sing|POS=VERB|Person=3|Tense=Pres|VerbForm=Fin , Number=Plur|POS=PRON|PronType=Rel , Gender=Masc|Number=Plur|POS=DET|PronType=Int , Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Imp|VerbForm=Fin , AdvType=Tim|POS=NOUN , Gender=Masc|Number=Plur|POS=DET|PronType=Ind , Gender=Fem|Number=Plur|POS=DET|PronType=Ind , Gender=Masc|Number=Sing|POS=DET|PronType=Int , Mood=Cnd|Number=Sing|POS=AUX|Person=3|VerbForm=Fin , Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Imp|VerbForm=Fin , Number=Sing|POS=DET|PronType=Art , Gender=Masc|Number=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs , Case=Acc|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs , Gender=Masc|Number=Sing|POS=PRON|PronType=Int , POS=PUNCT|PunctType=Semi , Mood=Cnd|Number=Plur|POS=AUX|Person=3|VerbForm=Fin , Case=Dat|Number=Sing|POS=PRON|Person=3|PronType=Prs , Gender=Masc|NumType=Card|Number=Plur|POS=NUM , Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Imp|VerbForm=Fin , Gender=Fem|Number=Sing|POS=PRON|PronType=Ind , Mood=Sub|Number=Sing|POS=AUX|Person=3|Tense=Imp|VerbForm=Fin , NumForm=Digit|POS=SYM , Gender=Masc|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part , Gender=Fem|Number=Sing|POS=PRON|PronType=Int , Gender=Fem|Number=Sing|POS=DET|PronType=Int , POS=PRON|PronType=Int , Gender=Fem|Number=Plur|POS=DET|PronType=Int , Mood=Cnd|Number=Sing|POS=VERB|Person=3|VerbForm=Fin , Mood=Cnd|Number=Plur|POS=VERB|Person=3|VerbForm=Fin , POS=PART , Gender=Fem|Number=Sing|POS=PRON|PronType=Dem , Gender=Masc|Number=Sing|POS=DET|PronType=Tot , Gender=Masc|Number=Plur|POS=PRON|PronType=Dem , POS=ADJ , 
Gender=Masc|Number=Plur|POS=PRON|Person=3|PronType=Prs , Degree=Cmp|Number=Plur|POS=ADJ , POS=PUNCT|PunctType=Dash , Mood=Sub|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin , Case=Acc|Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Prs , Mood=Sub|Number=Sing|POS=VERB|Person=3|Tense=Imp|VerbForm=Fin , Gender=Fem|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part , Gender=Fem|Number=Sing|POS=PRON|Person=3|PronType=Prs , Gender=Masc|POS=NOUN , Mood=Ind|Number=Sing|POS=VERB|Person=3|Tense=Past|VerbForm=Fin , Gender=Fem|Number=Plur|POS=PRON|PronType=Int , Gender=Masc|NumType=Ord|Number=Plur|POS=ADJ , Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Fut|VerbForm=Fin , POS=PUNCT|PunctType=Colo , Gender=Masc|NumType=Card|POS=NUM , Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs , Number=Sing|POS=PRON|PronType=Int , POS=PUNCT|PunctType=Quot , Mood=Imp|Number=Sing|POS=VERB|Person=3|VerbForm=Fin , Gender=Fem|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs , Gender=Masc|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs , Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Fut|VerbForm=Fin , POS=AUX|VerbForm=Ger , Gender=Fem|Number=Plur|POS=PRON|Person=3|PronType=Prs , Mood=Imp|Number=Sing|POS=AUX|Person=3|VerbForm=Fin , Number=Plur|POS=PRON|PronType=Ind , Gender=Masc|Number=Sing|POS=PRON|PronType=Dem , Case=Acc,Dat|Number=Sing|POS=PRON|Person=2|Polite=Infm|PrepCase=Npr|PronType=Prs , Gender=Masc|Number=Plur|POS=PRON|PronType=Int , Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin , NumForm=Digit|NumType=Frac|POS=NUM , POS=VERB , Gender=Fem|Number=Plur|POS=PRON|PronType=Dem , Gender=Fem|POS=NOUN , Case=Acc,Dat|Number=Sing|POS=PRON|Person=1|PrepCase=Npr|PronType=Prs , Mood=Sub|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin , Mood=Ind|Number=Plur|POS=AUX|Person=2|Tense=Fut|VerbForm=Fin , Mood=Sub|Number=Plur|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin , Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Imp|VerbForm=Fin , 
Number=Plur|POS=PRON|Person=1|PronType=Prs , Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin , Case=Nom|Number=Sing|POS=PRON|Person=2|Polite=Infm|PronType=Prs , POS=X , Mood=Cnd|Number=Plur|POS=AUX|Person=1|VerbForm=Fin , Number=Sing|POS=DET|PronType=Dem , POS=DET , Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Fut|VerbForm=Fin , Mood=Ind|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin , POS=DET|PronType=Art , Gender=Masc|Number=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs , NumType=Ord|Number=Sing|POS=ADJ , Gender=Fem|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part , Number=Plur|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs , Gender=Fem|Number=Plur|POS=AUX|Tense=Past|VerbForm=Part , Gender=Masc|Number=Plur|POS=AUX|Tense=Past|VerbForm=Part , Number=Plur|POS=PRON|PronType=Dem , Mood=Imp|Number=Plur|POS=VERB|Person=1|VerbForm=Fin , POS=PRON|PronType=Ind , Mood=Ind|Number=Sing|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin , Mood=Imp|Number=Plur|POS=VERB|Person=3|VerbForm=Fin , Case=Nom|Number=Sing|POS=PRON|Person=1|PronType=Prs , Case=Acc|Number=Sing|POS=PRON|Person=1|PrepCase=Pre|PronType=Prs , Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin , Mood=Ind|Number=Plur|POS=VERB|Person=1|Tense=Imp|VerbForm=Fin , POS=PUNCT|PunctSide=Fin|PunctType=Qest , NumForm=Digit|NumType=Ord|POS=ADJ , Case=Acc|POS=PRON|Person=3|PrepCase=Pre|PronType=Prs|Reflex=Yes , NumForm=Digit|NumType=Frac|POS=SYM , Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin , Gender=Masc|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs , Gender=Masc|Number=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs , Mood=Sub|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin , POS=PUNCT|PunctSide=Ini|PunctType=Qest , NumType=Card|Number=Sing|POS=NUM , Foreign=Yes|POS=PRON|PronType=Int , Foreign=Yes|Mood=Ind|POS=VERB|VerbForm=Fin , Foreign=Yes|POS=ADP , Gender=Masc|Number=Sing|POS=PROPN , POS=PUNCT|PunctSide=Ini|PunctType=Excl , 
POS=PUNCT|PunctSide=Fin|PunctType=Excl , Mood=Cnd|Number=Sing|POS=AUX|Person=1|VerbForm=Fin , Number=Plur|POS=PRON|Person=2|Polite=Form|PronType=Prs , Mood=Sub|POS=AUX|Person=1|Tense=Imp|VerbForm=Fin , POS=PUNCT|PunctSide=Ini|PunctType=Comm , POS=PUNCT|PunctSide=Fin|PunctType=Comm , Number=Plur|POS=PRON|Person=2|PronType=Prs , Mood=Ind|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin , Case=Acc,Dat|Number=Plur|POS=PRON|Person=2|PronType=Prs , Mood=Cnd|Number=Sing|POS=VERB|Person=1|VerbForm=Fin , Mood=Cnd|Number=Plur|POS=VERB|Person=1|VerbForm=Fin , Mood=Ind|Number=Plur|POS=AUX|Person=1|Tense=Imp|VerbForm=Fin , Gender=Masc|Number=Plur|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs , Definite=Ind|Gender=Masc|Number=Sing|POS=DET|PronType=Art , Number=Sing|POS=PRON|Person=2|Polite=Form|PronType=Prs , Gender=Masc|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs , Mood=Ind|Number=Sing|POS=VERB|Person=1|Tense=Imp|VerbForm=Fin , POS=VERB|Tense=Past|VerbForm=Part , Mood=Imp|Number=Plur|POS=AUX|Person=3|VerbForm=Fin , Case=Nom|POS=PRON|Person=3|PronType=Prs , Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Past|VerbForm=Fin , Gender=Fem|Number=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs , Gender=Masc|Number=Sing|POS=PRON|PronType=Rel , Definite=Ind|Number=Sing|POS=DET|PronType=Art , Gender=Masc|Number=Sing|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs , Number=Plur|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs , POS=AUX|Tense=Past|VerbForm=Part , Gender=Fem|NumType=Card|POS=NUM , Mood=Ind|Number=Sing|POS=AUX|Person=1|Tense=Imp|VerbForm=Fin , Mood=Sub|Number=Sing|POS=VERB|Person=1|Tense=Imp|VerbForm=Fin , Gender=Fem|Number=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs , Mood=Ind|Number=Sing|POS=AUX|Person=1|Tense=Fut|VerbForm=Fin , Mood=Ind|Number=Plur|POS=AUX|Person=3|Tense=Past|VerbForm=Fin , AdvType=Tim|Degree=Cmp|POS=ADV , Case=Acc|Number=Sing|POS=PRON|Person=2|Polite=Infm|PrepCase=Pre|PronType=Prs , 
POS=DET|PronType=Rel , Definite=Ind|Gender=Fem|Number=Plur|POS=DET|PronType=Art , Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Fut|VerbForm=Fin , POS=INTJ , Mood=Sub|Number=Sing|POS=AUX|Person=1|Tense=Pres|VerbForm=Fin , POS=VERB|VerbForm=Fin , Mood=Ind|Number=Plur|POS=VERB|Person=3|Tense=Past|VerbForm=Fin , Definite=Ind|Gender=Fem|Number=Sing|POS=DET|PronType=Art , Mood=Sub|Number=Plur|POS=AUX|Person=1|Tense=Imp|VerbForm=Fin , Gender=Fem|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs , Mood=Sub|Number=Sing|POS=VERB|Person=2|Tense=Pres|VerbForm=Fin , Case=Acc|POS=PRON|Person=3|PronType=Prs|Reflex=Yes , Foreign=Yes|POS=NOUN , Foreign=Yes|Mood=Ind|Number=Sing|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin , Foreign=Yes|Gender=Masc|Number=Sing|POS=PRON|Person=3|PronType=Prs , Foreign=Yes|POS=SCONJ , Foreign=Yes|Gender=Fem|Number=Sing|POS=DET|PronType=Art , Gender=Masc|POS=SYM , Gender=Fem|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs , Number=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs , Gender=Masc|Number=Plur|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs , Gender=Fem|Number=Sing|POS=PROPN , Mood=Sub|Number=Plur|POS=VERB|Person=1|Tense=Imp|VerbForm=Fin , Definite=Def|Foreign=Yes|Gender=Masc|Number=Sing|POS=DET|PronType=Art , Foreign=Yes|POS=VERB , Foreign=Yes|POS=ADJ , Foreign=Yes|POS=DET , Foreign=Yes|POS=ADV , POS=PUNCT|PunctSide=Fin|Punta d'aignctType=Brck , Degree=Cmp|POS=ADJ , AdvType=Tim|POS=SYM , Number=Plur|POS=DET|PronType=Dem , Mood=Ind|Number=Sing|POS=VERB|Person=2|Tense=Fut|VerbForm=Fin |
parser |
ROOT , acl , advcl , advmod , amod , appos , aux , case , cc , ccomp , compound , conj , cop , csubj , dep , det , expl:pass , fixed , flat , iobj , mark , nmod , nsubj , nummod , obj , obl , parataxis , punct , xcomp |
ner |
EMAIL , FINANCIAL , ID , LOC , MISC , ORG , PER , TELEPHONE , VEHICLE , ZIP |
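The morphologizer labels above follow a `Feature=Value` layout, with features separated by `|` and multiple values for one feature separated by `,`. As a rough sketch of how such a label could be split into its parts, assuming that layout holds for every label (the `parse_morph_label` helper is hypothetical and not part of spaCy or the pipeline):

```python
from typing import Dict, List


def parse_morph_label(label: str) -> Dict[str, List[str]]:
    """Split a 'Feat=Val|Feat=Val1,Val2' label into a feature dictionary."""
    features: Dict[str, List[str]] = {}
    for pair in label.split("|"):
        # partition splits on the first '=' only, so the name stays intact.
        name, _, value = pair.partition("=")
        features[name] = value.split(",")
    return features


parsed = parse_morph_label("Case=Acc,Dat|Number=Plur|POS=PRON|Person=3|PronType=Prs")
print(parsed["POS"], parsed["Case"])
```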
Accuracy
Type | Score |
---|---|
ENTS_F | 69.12 |
ENTS_P | 74.60 |
ENTS_R | 64.38 |
NER_LOSS | 26573.78 |
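As a sanity check on the scores, the F-score can be recomputed from precision and recall. This sketch assumes ENTS_F is the usual harmonic mean of ENTS_P and ENTS_R; the `f_score` helper is written for this example only. Because the published precision and recall are already rounded, the recomputed value may differ from the reported one in the last decimal place.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the accuracy table (percentages).
ents_p, ents_r = 74.60, 64.38
ents_f = f_score(ents_p, ents_r)
print(round(ents_f, 2))
```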
Evaluation results
- NER Precision (self-reported): 0.746
- NER Recall (self-reported): 0.644
- NER F Score (self-reported): 0.691