Dataset Used

This model was trained on the CoNLL-2003 dataset for named entity recognition (NER).

The dataset includes the following labels:

  • O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC

For detailed descriptions of these labels, please refer to the dataset card.
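The label list can be turned into the id/label mappings a token-classification model typically needs, and the B-/I- prefixes can be used to merge tagged tokens into entity spans. The sketch below assumes that label ids follow the order listed above and that B- opens an entity while I- continues it; the `bio_to_spans` helper and the sample tokens are illustrative and are not part of the model or the dataset.

```python
# Labels in the order given in this card (the id order is an assumption).
LABELS = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC"]
id2label = dict(enumerate(LABELS))
label2id = {label: i for i, label in id2label.items()}


def bio_to_spans(tokens, tags):
    """Group B-/I- tagged tokens into (entity_type, text) spans."""
    spans, current_type, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        prefix, _, entity_type = tag.partition("-")
        # An I- tag of the same type extends the entity that is currently open.
        if prefix == "I" and entity_type == current_type:
            current_tokens.append(token)
            continue
        # Anything else closes the open entity, if there is one.
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))
        if prefix in ("B", "I"):
            # A stray I- tag is treated leniently as the start of a new entity.
            current_type, current_tokens = entity_type, [token]
        else:
            current_type, current_tokens = None, []
    if current_tokens:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


tokens = ["Sylvain", "works", "at", "Hugging", "Face", "in", "Brooklyn"]
tags = ["B-PER", "O", "O", "B-ORG", "I-ORG", "O", "B-LOC"]
print(bio_to_spans(tokens, tags))
# [('PER', 'Sylvain'), ('ORG', 'Hugging Face'), ('LOC', 'Brooklyn')]
```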

Model Training Details

Training Arguments

  • Model Architecture: bert-base-cased for token classification
  • Learning Rate: 2e-5
  • Number of Epochs: 20
  • Weight Decay: 0.01
  • Evaluation Strategy: epoch
  • Save Strategy: epoch

All other training arguments were left at their Hugging Face Transformers defaults.
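The settings listed above could be passed to the Transformers Trainer API roughly as follows. This is a sketch, not the author's training script: it assumes the Trainer API was used, `output_dir` is a hypothetical name, and the helper `build_training_args` is illustrative. The evaluation-strategy keyword is named `eval_strategy` in recent Transformers releases and `evaluation_strategy` in older ones, so the helper picks whichever the installed version defines.

```python
import dataclasses

# Hyperparameters taken from the list in this card.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "num_train_epochs": 20,
    "weight_decay": 0.01,
    "save_strategy": "epoch",
}


def build_training_args(output_dir="bert-finetuned-ner"):
    """Build TrainingArguments from HYPERPARAMS (output_dir is a hypothetical name)."""
    # Imported lazily so the dictionary above can be inspected without Transformers installed.
    from transformers import TrainingArguments

    field_names = {field.name for field in dataclasses.fields(TrainingArguments)}
    # Use the newer keyword when it exists, otherwise fall back to the older one.
    eval_key = "eval_strategy" if "eval_strategy" in field_names else "evaluation_strategy"
    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS, **{eval_key: "epoch"})
```

Calling `build_training_args()` requires Transformers and a backend such as PyTorch; the returned object would then be handed to a `Trainer` together with the model, datasets, and a data collator.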

Evaluation Results

Validation Set Performance

  • Overall Metrics:
    • Precision: 94.44%
    • Recall: 95.74%
    • F1 Score: 95.09%
    • Accuracy: 98.73%

Per-Label Performance

| Entity Type | Precision | Recall | F1 Score |
|-------------|-----------|--------|----------|
| LOC         | 97.27%    | 97.11% | 97.19%   |
| MISC        | 87.46%    | 91.54% | 89.45%   |
| ORG         | 93.37%    | 93.44% | 93.40%   |
| PER         | 96.02%    | 98.15% | 97.07%   |

Test Set Performance

  • Overall Metrics:
    • Precision: 89.90%
    • Recall: 91.91%
    • F1 Score: 90.89%
    • Accuracy: 97.27%

Per-Label Performance

| Entity Type | Precision | Recall | F1 Score |
|-------------|-----------|--------|----------|
| LOC         | 92.87%    | 92.87% | 92.87%   |
| MISC        | 75.55%    | 82.76% | 78.99%   |
| ORG         | 88.32%    | 90.61% | 89.45%   |
| PER         | 95.28%    | 96.23% | 95.75%   |
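As a quick sanity check on the numbers above, F1 can be recomputed from precision and recall. The sketch assumes the reported F1 is the standard harmonic mean of the two; because the published precision and recall are already rounded, a recomputed value may differ from the reported one in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (works for fractions or percentages)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the overall metrics, in percent.
reported = {
    "validation": (94.44, 95.74, 95.09),
    "test": (89.90, 91.91, 90.89),
}

for split, (precision, recall, f1) in reported.items():
    recomputed = f1_score(precision, recall)
    print(f"{split}: recomputed F1 = {recomputed:.2f}%, reported = {f1:.2f}%")
```

For both splits the recomputed overall F1 matches the reported value to two decimal places.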

How to Use the Model

You can load the model directly from the Hugging Face Model Hub:

from transformers import pipeline

# Replace with your specific model checkpoint
model_checkpoint = "Prikshit7766/bert-finetuned-ner"
token_classifier = pipeline(
    "token-classification", 
    model=model_checkpoint, 
    aggregation_strategy="simple"
)

# Example usage
result = token_classifier("My name is Sylvain and I work at Hugging Face in Brooklyn.")
print(result)

Example Output

[
   {
      "entity_group":"PER",
      "score":0.9999881,
      "word":"Sylvain",
      "start":11,
      "end":18
   },
   {
      "entity_group":"ORG",
      "score":0.99961376,
      "word":"Hugging Face",
      "start":33,
      "end":45
   },
   {
      "entity_group":"LOC",
      "score":0.99989843,
      "word":"Brooklyn",
      "start":49,
      "end":57
   }
]
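The `start` and `end` fields are character offsets into the input string, with `end` exclusive, so each entity can be recovered by slicing the original text. The sketch below works on the example output shown above without loading the model; the `group_by_type` helper and its `min_score` threshold are illustrative, not part of the pipeline API.

```python
from collections import defaultdict

text = "My name is Sylvain and I work at Hugging Face in Brooklyn."

# Entities copied from the example output.
entities = [
    {"entity_group": "PER", "score": 0.9999881, "word": "Sylvain", "start": 11, "end": 18},
    {"entity_group": "ORG", "score": 0.99961376, "word": "Hugging Face", "start": 33, "end": 45},
    {"entity_group": "LOC", "score": 0.99989843, "word": "Brooklyn", "start": 49, "end": 57},
]

# In this example, slicing the input with the offsets reproduces each "word".
for entity in entities:
    assert text[entity["start"]:entity["end"]] == entity["word"]


def group_by_type(entities, min_score=0.0):
    """Collect entity strings per type, keeping only those scoring at least min_score."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


print(group_by_type(entities, min_score=0.9))
# {'PER': ['Sylvain'], 'ORG': ['Hugging Face'], 'LOC': ['Brooklyn']}
```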