metadata
license: gpl-3.0
language:
  - pt
  - gl
widget:
  - text: >-
      A minha amiga Rosa, de São Paulo, estudou en Montreal. Agora trabalha em
      Santiago de Compostela com o Mário.

Named Entity Recognition (NER) model for Portuguese

This is a NER model for Portuguese which uses the standard 'enamex' classes: LOC (geographical locations); PER (people); ORG (organizations); MISC (other entities).
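A minimal usage sketch with the `transformers` token-classification pipeline is given below. The repository id in `MODEL_ID` is an assumption inferred from the name of the large variant and is not stated in this card, and `group_by_class` and `tag_text` are hypothetical helper names introduced only for illustration.

```python
from collections import defaultdict

# Assumption: this repository id is inferred from the name of the large
# variant; it is not stated in this card, so adjust it if it differs.
MODEL_ID = "marcosgg/bert-base-pt-ner-enamex"


def group_by_class(entities):
    """Group aggregated pipeline predictions by entity class (LOC, PER, ORG, MISC)."""
    grouped = defaultdict(list)
    for ent in entities:
        # With an aggregation strategy set, each prediction is a dict that
        # carries the class under "entity_group" and the text under "word".
        grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


def tag_text(text, model_id=MODEL_ID):
    """Run the NER model on `text`; the weights are downloaded on first use."""
    # Imported lazily so the helper above works without `transformers` installed.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into whole entities
    )
    return group_by_class(ner(text))
```

Calling `tag_text` on the widget sentence above should return a dictionary keyed by the classes found in it, for example `PER` and `LOC`; the exact spans depend on the model's predictions.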

The model is based on BERTimbau Base, fine-tuned on a combination of available corpora (see [1] for details).

There is an alternative model trained using [BERTimbau Large](https://huggingface.co/neuralmind/bert-large-portuguese-cased): [bert-large-pt-ner-enamex](https://huggingface.co/marcosgg/bert-large-pt-ner-enamex).

It was trained for 3 epochs with a batch size of 8 and a learning rate of 2e-5. On the test set it achieved a precision of 0.913, a recall of 0.918 and an F1 of 0.915.
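As a rough sketch, the reported hyperparameters can be written as keyword arguments in the style of `transformers.TrainingArguments`, and the reported scores can be checked for consistency with the usual definition of F1 as the harmonic mean of precision and recall. The mapping of "batch size" to a per-device value assumes single-device training, which this card does not state, and the card does not say how the scores were averaged across classes.

```python
# Hyperparameters reported in this card, expressed with the parameter names
# used by transformers.TrainingArguments.
# Assumption: "batch size of 8" is taken as the per-device batch size.
training_kwargs = {
    "per_device_train_batch_size": 8,
    "learning_rate": 2e-5,
    "num_train_epochs": 3,
}

# Reported test-set scores.
precision, recall = 0.913, 0.918

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.915, matching the reported F1
```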

[1] Pablo Gamallo, Marcos Garcia & Patricia Martín-Rodilla, 2019. NER and open information extraction for Portuguese: notebook for IberLEF 2019 Portuguese named entity recognition and relation extraction tasks. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019), co-located with the 35th Conference of the Spanish Society for Natural Language Processing (SEPLN 2019), pp. 457-467.