syubraj committed · verified
Commit 440b040 · Parent(s): cd88814

syubraj/espanyol_bert_based_cased_ner

Files changed (1):
  1. README.md +13 -79
README.md CHANGED
@@ -7,15 +7,6 @@ tags:
  model-index:
  - name: fine_tune_bert_output
    results: []
- datasets:
- - unimelb-nlp/wikiann
- language:
- - es
- metrics:
- - recall
- - precision
- - f1
- pipeline_tag: token-classification
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -34,75 +25,18 @@ It achieves the following results on the evaluation set:
  - Org F1: 0.8663
  - Per F1: 0.9367

- ## Labels
- The following table represents the labels used by the model along with their corresponding indices:
-
- | Index | Label |
- |-------|---------|
- | 0 | O |
- | 1 | B-PER |
- | 2 | I-PER |
- | 3 | B-ORG |
- | 4 | I-ORG |
- | 5 | B-LOC |
- | 6 | I-LOC |
-
- ### Label Descriptions
- - **O**: Outside of a named entity.
- - **B-PER**: Beginning of a person's name.
- - **I-PER**: Inside a person's name.
- - **B-ORG**: Beginning of an organization's name.
- - **I-ORG**: Inside an organization's name.
- - **B-LOC**: Beginning of a location name.
- - **I-LOC**: Inside a location name.
-
- ## Inference Example
- ```python
- from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline
-
- # Load the model and tokenizer
- model_name = "syubraj/espanyol_bert_based_ner"
- model = AutoModelForTokenClassification.from_pretrained(model_name)
- tokenizer = AutoTokenizer.from_pretrained(model_name)
-
- # Custom label mapping
- custom_label_mapping = {
-     0: 'O',      # Outside of any named entity
-     1: 'B-PER',  # Beginning of a person's name
-     2: 'I-PER',  # Inside a person's name
-     3: 'B-ORG',  # Beginning of an organization's name
-     4: 'I-ORG',  # Inside an organization's name
-     5: 'B-LOC',  # Beginning of a location's name
-     6: 'I-LOC',  # Inside a location's name
- }
-
- # Pipeline for Named Entity Recognition (NER)
- ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer)
-
- # Input text
- text = "Donald trabaja en Twitter"
-
- raw_results = ner_pipeline(text)
-
- results_with_labels = []
- for entity in raw_results:
-     label_index = int(entity['entity'].split('_')[1])
-
-     entity_label = custom_label_mapping.get(label_index, "UNKNOWN")
-
-     entity_with_label = {
-         "entity": entity_label,
-         "word": entity["word"],
-         "start": entity["start"],
-         "end": entity["end"],
-         "score": entity["score"],
-     }
-     results_with_labels.append(entity_with_label)
-
- print("NER Results:")
- for result in results_with_labels:
-     print(result)
- ```
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
  ## Training procedure

  ### Training hyperparameters
@@ -139,4 +73,4 @@ The following hyperparameters were used during training:
  - Transformers 4.44.2
  - Pytorch 2.4.1+cu121
  - Datasets 3.2.0
- - Tokenizers 0.19.1
+ - Tokenizers 0.19.1
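The commit removes the README's hand-rolled mapping from raw `LABEL_<i>` pipeline tags to readable BIO labels. For readers who still need that step, here is a minimal, self-contained sketch of just the relabeling logic from the removed section; the `raw` entity dicts are hypothetical stand-ins for `pipeline("ner")` output, so no model download is required:

```python
# Mapping from the model card's removed "Labels" table.
ID2LABEL = {
    0: "O", 1: "B-PER", 2: "I-PER",
    3: "B-ORG", 4: "I-ORG", 5: "B-LOC", 6: "I-LOC",
}

def relabel(entities):
    """Replace generic 'LABEL_<i>' tags with readable BIO labels."""
    out = []
    for ent in entities:
        idx = int(ent["entity"].split("_")[1])  # "LABEL_3" -> 3
        out.append({**ent, "entity": ID2LABEL.get(idx, "UNKNOWN")})
    return out

# Hypothetical raw pipeline output for "Donald trabaja en Twitter".
raw = [
    {"entity": "LABEL_1", "word": "Donald", "start": 0, "end": 6, "score": 0.99},
    {"entity": "LABEL_3", "word": "Twitter", "start": 18, "end": 25, "score": 0.97},
]

for result in relabel(raw):
    print(result)
```

An alternative worth noting: if the model's `config.json` carries an `id2label` mapping, the `transformers` pipeline emits the readable tags directly and this post-processing step is unnecessary.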