Fusion NER Models

AI & ML interests

NLP, NER

Here you can find the NER models for the Fusion project.

Table of contents:

  1. NER Models
  2. Results
  3. Hebrew NLP models
  4. Footnotes

NER Models:

Below is a description of each of our models. Each row lists the model nickname, training description, model path (link), source dataset (with link), base model, entity types, and trainer.

| model name | model description | model path | datasets | link to dataset | base model | entity types | trainer |
|---|---|---|---|---|---|---|---|
| Basic | Basic training on IAHALT | FusioNER/Basic_IAHALT | IAHALT | FusioNER/Basic | HeRo | classic[4] | Etzion |
| Vitaly | Vitaly training on IAHALT (with BI-BI problem) | FusioNER/Vitaly_NER | IAHALT | FusioNER/Vitaly | HeRo | classic[4] | Vitaly |
| Name-Sentences | Training on IAHALT + Name-Sentences | FusioNER/Name-Sentences | IAHALT | FusioNER/Name_Sentences | HeRo | classic[4] | Etzion |
| Entity-Injection | Training on IAHALT + Entity-Injection | FusioNER/Entity-Injection | IAHALT | FusioNER/Entity_Injection | HeRo | classic[4] | Etzion |
| Smart_Injection | Training on IAHALT + Name-Sentences + Entity-Injection | FusioNER/Smart_Injection | IAHALT | FusioNER/Smart_Injection | HeRo | classic[4] | Etzion |
| NEMO | Basic training on NEMO dataset | FusioNER/Nemo | NEMO | FusioNER/NEMO | HeRo | classic[4] | Etzion |
| IAHALT_and_NEMO | Basic training on IAHALT + NEMO | FusioNER/IAHALT_and_NEMO | IAHALT + NEMO | FusioNER/IAHALT_and_NEMO | HeRo | classic[4] | Etzion |
| IAHALT_and_NEMO_PP | Training on IAHALT + NEMO + Name-Sentences + Entity-Injection | FusioNER/IAHALT_and_NEMO_and_PP | IAHALT + NEMO | FusioNER/IAHALT_and_NEMO_PP | HeRo | classic[4] | Etzion |
| Animals | Training on IAHALT + Entity-Injection (of animal names as PER entities) | FusioNER/Animals | IAHALT | FusioNER/Animals | HeRo | classic[4] | Etzion |
| PRS-Injection | Training on IAHALT + Entity-Injection (of PRS names as PER entities) | FusioNER/PRS-Injection | IAHALT | FusioNER/PRS_locations | HeRo | classic[4] | Etzion |
| DICTA_Basic | Training the DICTA model on the basic IAHALT dataset | FusioNER/Dicta_Small_Basic | IAHALT | FusioNER/Smart_Injection | DICTA | classic[4] | Etzion |
| DICTA_Small_Smart | Training the DICTA model on the IAHALT + Name-Sentences + Entity-Injection dataset | FusioNER/Dicta_Small_Smart | IAHALT | FusioNER/Smart_Injection | DICTA | classic[4] | Etzion |
| DICTA_basic_NER | Training the DICTA-ner model on the basic IAHALT dataset | FusioNER/DICTA_basic | IAHALT | FusioNER/Basic | DICTA-ner | classic[4] | Etzion |
| DICTA_smart_NER | Training the DICTA-ner model on the IAHALT + Name-Sentences + Entity-Injection dataset | FusioNER/DICTA_Smart | IAHALT | FusioNER/Smart_Injection | DICTA-ner | classic[4] | Etzion |
| DICTA_Large_Smart | Training the DICTA Large model on the IAHALT + Name-Sentences + Entity-Injection dataset | FusioNER/Dicta_Large_Smart | IAHALT | FusioNER/Smart_Injection | DICTA Large | classic[4] | Etzion |
| TEC_NER | Basic technology NER model | FusioNER/tec_ner | TEC_NER | FusioNER/tec_ner | base model | TEC | Yehoshua |
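
As a usage sketch only: assuming a model repository from the table is accessible to your account and exposes a standard token-classification head, it could be loaded through the Hugging Face `transformers` pipeline. The helper names `load_tagger` and `summarize` are illustrative and not part of the project.

```python
from typing import Any


def load_tagger(model_id: str = "FusioNER/Dicta_Small_Smart"):
    """Build a token-classification pipeline for one model path from the table.

    Assumption: the repository is reachable and ships a compatible tokenizer.
    """
    # Imported lazily so that summarize() works without transformers installed.
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into whole entities
    )


def summarize(entities: list[dict[str, Any]]) -> list[tuple[str, str]]:
    """Reduce aggregated pipeline output to (entity type, surface text) pairs."""
    return [(e["entity_group"], e["word"]) for e in entities]


# Usage (needs network access and the transformers package):
# tagger = load_tagger()
# print(summarize(tagger("הארי פוטר ורון וויזלי")))
```

With `aggregation_strategy="simple"`, each returned dictionary carries an `entity_group` key, which is what `summarize` reads.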

Results

We tested our models on the IAHALT test set. We also evaluated other models, such as DictaBert and HeBert. The performance results are:

| Model name | Precision | Recall | F1-Score | Time (in seconds) |
|---|---|---|---|---|
| IAHALT_and_NEMO_PP | 0.714 | 0.353 | 0.461 | 83.128 |
| HeBert | 0.574 | 0.474 | 0.494 | 86.483 |
| NEMO | 0.553 | 0.51 | 0.525 | 81.422 |
| IAHALT_and_NEMO | 0.692 | 0.678 | 0.684 | 83.702 |
| Vitaly | 0.883 | 0.794 | 0.836 | 83.773 |
| DictaBert | 0.916 | 0.834 | 0.872 | 70.465 |
| DICTA_large | 0.917 | 0.845 | 0.879 | 206.251 |
| Name-Sentences | 0.895 | 0.865 | 0.879 | 82.674 |
| Basic | 0.897 | 0.866 | 0.881 | 84.479 |
| Smart_Injection | 0.898 | 0.867 | 0.881 | 82.253 |
| DICTA_Basic | 0.903 | 0.875 | 0.888 | 69.419 |
| DICTA_Large_Smart | 0.904 | 0.875 | 0.889 | 204.324 |
| DICTA_Small_Smart | 0.904 | 0.875 | 0.889 | 70.29 |

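Based on these results, we recommend the DICTA_Small_Smart model: it ties with DICTA_Large_Smart for the best F1 score (0.889) while running in about a third of the time (70.29 vs. 204.324 seconds).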
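
One way to reproduce this choice from the reported numbers is sketched below. The selection rule (highest F1, ties broken by the shortest run time) is an assumption about how the recommendation could be reached, and the name `recommend` is illustrative.

```python
# (model name, precision, recall, F1, time in seconds), copied from the results table
RESULTS = [
    ("IAHALT_and_NEMO_PP", 0.714, 0.353, 0.461, 83.128),
    ("HeBert", 0.574, 0.474, 0.494, 86.483),
    ("NEMO", 0.553, 0.51, 0.525, 81.422),
    ("IAHALT_and_NEMO", 0.692, 0.678, 0.684, 83.702),
    ("Vitaly", 0.883, 0.794, 0.836, 83.773),
    ("DictaBert", 0.916, 0.834, 0.872, 70.465),
    ("DICTA_large", 0.917, 0.845, 0.879, 206.251),
    ("Name-Sentences", 0.895, 0.865, 0.879, 82.674),
    ("Basic", 0.897, 0.866, 0.881, 84.479),
    ("Smart_Injection", 0.898, 0.867, 0.881, 82.253),
    ("DICTA_Basic", 0.903, 0.875, 0.888, 69.419),
    ("DICTA_Large_Smart", 0.904, 0.875, 0.889, 204.324),
    ("DICTA_Small_Smart", 0.904, 0.875, 0.889, 70.29),
]


def recommend(rows):
    """Return the model with the highest F1; ties go to the shortest run time."""
    return min(rows, key=lambda row: (-row[3], row[4]))[0]


best = recommend(RESULTS)
print(best)  # DICTA_Small_Smart
```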

Hebrew NLP models

The following table lists Hebrew NLP models:

| Model name | Link | Creator |
|---|---|---|
| HeNLP/HeRo | https://huggingface.co/HeNLP/HeRo | Vitaly Shalumov and Harel Haskey |
| dicta-il/dictabert | https://huggingface.co/dicta-il/dictabert | Shaltiel Shmidman, Avi Shmidman, and Moshe Koppel |
| dicta-il/dictabert-large | https://huggingface.co/dicta-il/dictabert-large | Shaltiel Shmidman, Avi Shmidman, and Moshe Koppel |
| avichr/heBERT | https://huggingface.co/avichr/heBERT | Avihay Chriqui and Inbal Yahav |

Footnotes

[1] Name-Sentences:

Adding sentences to the corpus that consist only of the entity we want the network to learn.
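
A minimal sketch of the idea, assuming BIO tagging and whitespace tokenization (neither is stated in this README); the function name `name_sentences` is hypothetical:

```python
def name_sentences(names: list[str], label: str) -> list[tuple[list[str], list[str]]]:
    """Turn each name into a training sentence that contains only that entity."""
    examples = []
    for name in names:
        tokens = name.split()
        if not tokens:
            continue  # skip empty strings
        # The first token opens the entity, the remaining tokens continue it.
        tags = [f"B-{label}"] + [f"I-{label}"] * (len(tokens) - 1)
        examples.append((tokens, tags))
    return examples


extra_examples = name_sentences(["הארי פוטר", "רון וויזלי"], "PER")
```

The resulting examples would be appended to the original training corpus.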

[2] Entity-Injection:

Replacing a tagged entity in the original corpus with a new entity. With this method, the model can learn new entities (not new labels!) that it did not extract before.
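
The replacement step could look roughly like this, assuming BIO-tagged token lists; `inject_entity` is an illustrative name, and replacing only the first matching entity is a simplification:

```python
def inject_entity(
    tokens: list[str], tags: list[str], label: str, new_entity: list[str]
) -> tuple[list[str], list[str]]:
    """Swap the first entity of type `label` for `new_entity`, keeping the context."""
    if not new_entity or f"B-{label}" not in tags:
        return tokens, tags  # nothing to replace
    start = tags.index(f"B-{label}")
    end = start + 1
    # Extend the span over the continuation tags of the same entity.
    while end < len(tags) and tags[end] == f"I-{label}":
        end += 1
    new_tags = [f"B-{label}"] + [f"I-{label}"] * (len(new_entity) - 1)
    return (
        tokens[:start] + new_entity + tokens[end:],
        tags[:start] + new_tags + tags[end:],
    )
```

The surrounding sentence and its other tags stay untouched, so the model sees the new entity in a realistic context.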

[3] BI-BI Problem:

Building a training corpus in which entities of the same type that appear in sequence are labeled as continuations of one another. For example, the text "הארי פוטר ורון וויזלי" ("Harry Potter and Ron Weasley") would be tagged as a SINGLE entity. This problem prevents the model from extracting entities correctly.
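
The effect can be illustrated with a small BIO decoder. The tag sequences below are an assumption about what the two labelings look like (the README does not show them), and `decode_spans` is a hypothetical helper:

```python
def decode_spans(tokens: list[str], tags: list[str]) -> list[tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) entities."""
    spans, current, label = [], [], None
    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (tag.startswith("I-") and tag[2:] != label)
        if starts_new:
            if current:
                spans.append((label, " ".join(current)))
            current, label = [token], tag[2:]
        elif tag.startswith("I-"):
            current.append(token)  # continuation of the open entity
        else:  # "O" closes any open entity
            if current:
                spans.append((label, " ".join(current)))
            current, label = [], None
    if current:
        spans.append((label, " ".join(current)))
    return spans


tokens = ["הארי", "פוטר", "ורון", "וויזלי"]
merged = decode_spans(tokens, ["B-PER", "I-PER", "I-PER", "I-PER"])  # BI-BI problem
split = decode_spans(tokens, ["B-PER", "I-PER", "B-PER", "I-PER"])  # intended labeling
print(len(merged), len(split))  # 1 2
```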

[4] Classic:

The classic NER types:

| entity type | full name | examples |
|---|---|---|
| PER | Person | אדולף היטלר, רודולף הס, מרדכי אנילביץ |
| GPE | Geopolitical Entity | גרמניה, פולין, ברלין, וורשה |
| LOC | Location | מזרח אירופה, אגן הים התיכון, הגליל |
| FAC | Facility | אוושוויץ, מגדלי התאומים, נתב"ג 2000, רחוב קפלן |
| ORG | Organization | המפלגה הנאצית, חברת גוגל, ממשלת חוף השנהב |
| TIMEX | Time Expression | 1945, שנת 1993, יום השואה, שנות ה-90 |
| EVE | Event | השואה, מלחמת העולם השנייה, שלטון האפרטהייד |
| TTL | Title | פיהרר, קיסר, מנכ"ל |
| ANG | Language | עברית, ערבית, גרמנית |
| DUC | Product | פייסבוק, F-16, תנובה |
| WOA | Work of Art | דו"ח מבקר המדינה, עיתון הארץ, הארי פוטר, תיק 2000 |
| MISC | Miscellaneous | קורונה, התו הירוק, מדלית זהב, ביטקוין |

Datasets for English NER (for cleaning wrong entities in English texts):

MIT License
