---
tags:
- spacy
- token-classification
language:
- sr
license: cc-by-sa-3.0
model-index:
- name: sr_pln_tesla_j125
  results:
  - task:
      name: NER
      type: token-classification
    metrics:
    - name: NER Precision
      type: precision
      value: 0.9470398711
    - name: NER Recall
      type: recall
      value: 0.9544716547
    - name: NER F Score
      type: f_score
      value: 0.9507412399
  - task:
      name: TAG
      type: token-classification
    metrics:
    - name: TAG (XPOS) Accuracy
      type: accuracy
      value: 0.9834346621
  - task:
      name: LEMMA
      type: token-classification
    metrics:
    - name: Lemma Accuracy
      type: accuracy
      value: 0.9816790168
---
`sr_pln_tesla_j125` is a spaCy model fine-tuned for part-of-speech tagging, lemmatization, and named entity recognition in Serbian texts. The pipeline includes a transformer layer based on Jerteh125. It recognizes 7 categories of named entities: PERS (persons), ROLE (professions), DEMO (demonyms), ORG (organizations), LOC (locations), WORK (artworks), and EVENT (events). The tables below summarize the pipeline, its label scheme, and its evaluation scores. The model was developed with the support of the Science Fund of the Republic of Serbia, under grant #7276, for the project 'Text Embeddings - Serbian Language Applications - TESLA'.

| Feature | Description |
| --- | --- |
| **Name** | `sr_pln_tesla_j125` |
| **Version** | `1.0.0` |
| **spaCy** | `>=3.7.2,<3.8.0` |
| **Default Pipeline** | `transformer`, `tagger`, `trainable_lemmatizer`, `ner` |
| **Components** | `transformer`, `tagger`, `trainable_lemmatizer`, `ner` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | n/a |
| **License** | `CC BY-SA 3.0` |
| **Author** | [Milica Ikonić Nešić, Saša Petalinkar, Mihailo Škorić, Ranka Stanković](https://tesla.rgf.bg.ac.rs/) |
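
The pipeline components listed above can be exercised with a few lines of spaCy. The following is a minimal sketch, not an official usage guide: it assumes the model package is already installed under the name `sr_pln_tesla_j125` and that a compatible spaCy version (see the table above) is available. The helpers `group_entities` and `load_pipeline` and the sample sentence are illustrative and not part of the model.

```python
from collections import defaultdict


def group_entities(doc):
    """Group entity texts by label for any object with spaCy-style `.ents`."""
    grouped = defaultdict(list)
    for ent in doc.ents:
        grouped[ent.label_].append(ent.text)
    return dict(grouped)


def load_pipeline(name="sr_pln_tesla_j125"):
    """Return the loaded pipeline, or None if spaCy or the model is missing.

    Assumption: the model package is installed under `name`.
    """
    try:
        import spacy

        return spacy.load(name)
    except (ImportError, OSError):
        # spaCy raises OSError when the named package cannot be found.
        return None


if __name__ == "__main__":
    nlp = load_pipeline()
    if nlp is None:
        print("spaCy or the sr_pln_tesla_j125 package is not installed.")
    else:
        doc = nlp("Nikola Tesla je rođen u Smiljanu.")
        # The tagger fills `tag_`; the trainable lemmatizer fills `lemma_`.
        for token in doc:
            print(token.text, token.tag_, token.lemma_)
        # Entities use the labels from the `ner` component.
        print(group_entities(doc))
```

Grouping by label is only a convenience for inspecting the output; the individual spans remain available through `doc.ents`.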

### Label Scheme

<details>

<summary>View label scheme (23 labels for 2 components)</summary>

| Component | Labels |
| --- | --- |
| **`tagger`** | `ADJ`, `ADP`, `ADV`, `AUX`, `CCONJ`, `DET`, `INTJ`, `NOUN`, `NUM`, `PART`, `PRON`, `PROPN`, `PUNCT`, `SCONJ`, `VERB`, `X` |
| **`ner`** | `DEMO`, `EVENT`, `LOC`, `ORG`, `PERS`, `ROLE`, `WORK` |

</details>

### Accuracy

| Type | Score |
| --- | --- |
| `TAG_ACC` | 98.34 |
| `LEMMA_ACC` | 98.17 |
| `ENTS_F` | 95.07 |
| `ENTS_P` | 94.70 |
| `ENTS_R` | 95.45 |
| `TRANSFORMER_LOSS` | 251816.91 |
| `TAGGER_LOSS` | 43163.04 |
| `TRAINABLE_LEMMATIZER_LOSS` | 115443.69 |
| `NER_LOSS` | 23281.59 |
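
As a sanity check on the table above, `ENTS_F` should equal the harmonic mean of `ENTS_P` and `ENTS_R`. The sketch below recomputes it from the unrounded NER precision and recall given in this card's metadata; the function name `f_score` is illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Unrounded NER values taken from this card's metadata.
NER_PRECISION = 0.9470398711
NER_RECALL = 0.9544716547
REPORTED_F = 0.9507412399

computed = f_score(NER_PRECISION, NER_RECALL)
print(f"computed F: {computed:.4f}, reported F: {REPORTED_F:.4f}")
```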