outputs
This model is a fine-tuned version of gerulata/slovakbert on the ju-bezdek/conll2003-SK-NER dataset. It achieves the following results on the evaluation (validation) set:
- Loss: 0.1752
- Precision: 0.8190
- Recall: 0.8390
- F1: 0.8288
- Accuracy: 0.9526
Model description
More information needed
Code example
from transformers import pipeline, AutoModel, AutoTokenizer
from spacy import displacy
import os
model_path="ju-bezdek/slovakbert-conll2003-sk-ner"
aggregation_strategy="max"
ner_pipeline = pipeline(task='ner', model=model_path, aggregation_strategy=aggregation_strategy)
input_sentence= "Ruský premiér Viktor Černomyrdin v piatok povedal, že prezident Boris Jeľcin , ktorý je na dovolenke mimo Moskvy , podporil mierový plán šéfa bezpečnosti Alexandra Lebedu pre Čečensko, uviedla tlačová agentúra Interfax"
ner_ents = ner_pipeline(input_sentence)
print(ner_ents)
ent_group_labels = [ner_pipeline.model.config.id2label[i][2:] for i in ner_pipeline.model.config.id2label if i>0]
options = {"ents":ent_group_labels}
dicplacy_ents = [{"start":ent["start"], "end":ent["end"], "label":ent["entity_group"]} for ent in ner_ents]
displacy.render({"text":input_sentence, "ents":dicplacy_ents}, style="ent", options=options, jupyter=True, manual=True)
Result:
Ruský
MISC
premiér
Viktor Černomyrdin
PER
v piatok povedal, že prezident
Boris Jeľcin,
PER
, ktorý je na dovolenke mimo
Moskvy
LOC
, podporil mierový plán šéfa bezpečnosti
Alexandra Lebedu
PER
pre
Čečensko,
LOC
uviedla tlačová agentúra
Interfax
ORG
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 15
Training results
Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
---|---|---|---|---|---|---|---|
0.3237 | 1.0 | 878 | 0.2541 | 0.7125 | 0.8059 | 0.7563 | 0.9283 |
0.1663 | 2.0 | 1756 | 0.2370 | 0.7775 | 0.8090 | 0.7929 | 0.9394 |
0.1251 | 3.0 | 2634 | 0.2289 | 0.7732 | 0.8029 | 0.7878 | 0.9385 |
0.0984 | 4.0 | 3512 | 0.2818 | 0.7294 | 0.8189 | 0.7715 | 0.9294 |
0.0808 | 5.0 | 4390 | 0.3138 | 0.7615 | 0.7900 | 0.7755 | 0.9326 |
0.0578 | 6.0 | 5268 | 0.3072 | 0.7548 | 0.8222 | 0.7871 | 0.9370 |
0.0481 | 7.0 | 6146 | 0.2778 | 0.7897 | 0.8156 | 0.8025 | 0.9408 |
0.0414 | 8.0 | 7024 | 0.3336 | 0.7695 | 0.8201 | 0.7940 | 0.9389 |
0.0268 | 9.0 | 7902 | 0.3294 | 0.7868 | 0.8140 | 0.8002 | 0.9409 |
0.0204 | 10.0 | 8780 | 0.3693 | 0.7657 | 0.8239 | 0.7938 | 0.9376 |
0.016 | 11.0 | 9658 | 0.3816 | 0.7932 | 0.8242 | 0.8084 | 0.9425 |
0.0108 | 12.0 | 10536 | 0.3607 | 0.7929 | 0.8256 | 0.8089 | 0.9431 |
0.0078 | 13.0 | 11414 | 0.3980 | 0.7915 | 0.8240 | 0.8074 | 0.9423 |
0.0062 | 14.0 | 12292 | 0.4096 | 0.7995 | 0.8247 | 0.8119 | 0.9436 |
0.0035 | 15.0 | 13170 | 0.4177 | 0.8006 | 0.8251 | 0.8127 | 0.9438 |
Framework versions
- Transformers 4.15.0
- Pytorch 1.10.1+cu102
- Datasets 1.17.0
- Tokenizers 0.10.3
- Downloads last month
- 6
Dataset used to train ju-bezdek/slovakbert-conll2003-sk-ner
Evaluation results
- Precision on ju-bezdek/conll2003-SK-NERself-reported0.819
- Recall on ju-bezdek/conll2003-SK-NERself-reported0.839
- F1 on ju-bezdek/conll2003-SK-NERself-reported0.829
- Accuracy on ju-bezdek/conll2003-SK-NERself-reported0.953