# Model Card: BERT for Named Entity Recognition (NER)

## Model Overview
This model, `bert-conll-ner`, is a fine-tuned version of `bert-base-uncased` trained for Named Entity Recognition (NER) on the CoNLL-2003 dataset. It identifies and classifies entities in text, such as person names (PER), organizations (ORG), locations (LOC), and miscellaneous (MISC) entities.
## Model Architecture

- Base Model: BERT (Bidirectional Encoder Representations from Transformers) with the `bert-base-uncased` architecture.
- Task: Token Classification (NER).
## Training Dataset

- Dataset: CoNLL-2003, a standard dataset for NER containing sentences annotated with named entity spans.
- Classes:
  - `PER` (Person)
  - `ORG` (Organization)
  - `LOC` (Location)
  - `MISC` (Miscellaneous)
  - `O` (Outside of any entity span)
## Performance Metrics

The model achieves the following results on the CoNLL-2003 evaluation set:

| Metric    | Value  |
|-----------|--------|
| Loss      | 0.0649 |
| Precision | 93.59% |
| Recall    | 95.07% |
| F1 Score  | 94.32% |
| Accuracy  | 98.79% |
These metrics indicate the model's high accuracy and robustness in identifying and classifying entities.
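For CoNLL-2003, precision, recall, and F1 are conventionally reported at the entity-span level rather than per token. The card does not state the evaluation tooling, but a common way to compute such scores is the `seqeval` package; the snippet below is only an illustration with made-up tag sequences, not the actual evaluation script.

```python
# Illustrative only: entity-level precision/recall/F1 with seqeval
# (pip install seqeval). The tag sequences below are made up.
from seqeval.metrics import precision_score, recall_score, f1_score

y_true = [["B-PER", "O", "O", "B-LOC", "I-LOC", "I-LOC", "O"]]
y_pred = [["B-PER", "O", "O", "B-LOC", "I-LOC", "O", "O"]]  # second span cut short

print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))
print("f1:       ", f1_score(y_true, y_pred))
```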
## Training Details
- Optimizer: AdamW (Adam with weight decay)
- Learning Rate: 2e-5
- Batch Size: 8
- Number of Epochs: 3
- Scheduler: Linear scheduler with warm-up steps
- Loss Function: Cross-entropy loss with ignored index (`-100`) for padding tokens (see the sketch below)
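These hyperparameters correspond to a standard Hugging Face `Trainer` setup for token classification. The following is a minimal sketch under those settings, not the exact training script; the warm-up ratio and weight decay values are assumptions, since the card only states that warm-up and weight decay were used.

```python
# Minimal fine-tuning sketch with the hyperparameters listed above (assumptions noted).
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                          DataCollatorForTokenClassification, Trainer, TrainingArguments)

raw = load_dataset("conll2003")
label_list = raw["train"].features["ner_tags"].feature.names  # O, B-PER, I-PER, ...

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(label_list))

def tokenize_and_align_labels(examples):
    # Align word-level NER tags to word-piece tokens; special tokens and
    # sub-word continuations get -100 so the cross-entropy loss ignores them.
    tokenized = tokenizer(examples["tokens"], truncation=True, is_split_into_words=True)
    all_labels = []
    for i, tags in enumerate(examples["ner_tags"]):
        word_ids = tokenized.word_ids(batch_index=i)
        labels, previous = [], None
        for word_id in word_ids:
            labels.append(-100 if word_id is None or word_id == previous else tags[word_id])
            previous = word_id
        all_labels.append(labels)
    tokenized["labels"] = all_labels
    return tokenized

encoded = raw.map(tokenize_and_align_labels, batched=True)

args = TrainingArguments(
    output_dir="bert-conll-ner",
    learning_rate=2e-5,              # as listed above
    per_device_train_batch_size=8,   # batch size 8
    num_train_epochs=3,              # 3 epochs
    lr_scheduler_type="linear",      # linear decay with warm-up
    warmup_ratio=0.1,                # assumption: exact warm-up steps not stated
    weight_decay=0.01,               # assumption: typical AdamW weight decay
)

trainer = Trainer(
    model=model,
    args=args,                        # Trainer uses AdamW by default
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    data_collator=DataCollatorForTokenClassification(tokenizer),
    tokenizer=tokenizer,
)
trainer.train()
```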
## Model Input/Output

- Input Format: Tokenized text with special tokens `[CLS]` and `[SEP]`.
- Output Format: Token-level predictions with corresponding labels from the NER tag set (`B-PER`, `I-PER`, etc.), as illustrated below.
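At the raw interface, the model returns one logit vector per word-piece token, and `model.config.id2label` maps the arg-max index back to a BIO tag. The sketch below shows this directly (it reuses the checkpoint id from the loading example in the next section and requires PyTorch); the pipeline shown further down is the more convenient path.

```python
# Sketch of the raw token-level input/output (requires torch and transformers).
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("sfarrukh/modernbert-conll-ner")
model = AutoModelForTokenClassification.from_pretrained("sfarrukh/modernbert-conll-ner")

inputs = tokenizer("John lives in New York City.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits            # shape: (1, sequence_length, num_labels)

predicted_ids = logits.argmax(dim=-1)[0].tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predicted_ids):
    print(token, model.config.id2label[label_id])  # e.g. "[CLS] O", "john B-PER", ...
```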
## How to Use the Model

### Installation

```bash
pip install transformers
```
### Loading the Model

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("sfarrukh/modernbert-conll-ner")
model = AutoModelForTokenClassification.from_pretrained("sfarrukh/modernbert-conll-ner")
```
### Running Inference

```python
from transformers import pipeline

nlp = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

text = "John lives in New York City."
result = nlp(text)
print(result)
```
Example output:

```python
[{'entity_group': 'PER',
  'score': 0.99912304,
  'word': 'john',
  'start': 0,
  'end': 4},
 {'entity_group': 'LOC',
  'score': 0.9993351,
  'word': 'new york city',
  'start': 14,
  'end': 27}]
```
## Limitations
- Domain-Specific Adaptability: Performance might drop on domain-specific texts (e.g., legal or medical) not covered by the CoNLL-2003 dataset.
- Ambiguity: Ambiguous entities or overlapping spans are not explicitly handled.
## Recommendations
- For domain-specific tasks, consider fine-tuning this model further on a relevant dataset.
- Use a pre-processing pipeline to handle long texts by splitting them into smaller segments (a minimal chunking sketch follows).
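BERT-based encoders truncate inputs beyond 512 word-piece tokens, so longer documents need to be processed in pieces. The sketch below is one simple way to do this, assuming the `nlp` pipeline from the usage example above; the fixed character window is an arbitrary choice, and entities straddling a window boundary may be missed (overlapping windows would address that at the cost of de-duplication).

```python
def ner_long_text(text, nlp, max_chars=1000):
    """Run the NER pipeline over a long text in fixed-size character windows
    and shift entity offsets back into the original string."""
    entities = []
    for start in range(0, len(text), max_chars):
        chunk = text[start:start + max_chars]
        for entity in nlp(chunk):
            entity["start"] += start
            entity["end"] += start
            entities.append(entity)
    return entities

# Usage with the pipeline defined in "Running Inference":
# result = ner_long_text(long_document, nlp)
```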
## Acknowledgements

- Transformers Library: Hugging Face
- Dataset: CoNLL-2003
- Base Model: `bert-base-uncased` by Google