🧠 BERT Classifier for Black Article Detection

📝 Model Overview

This repository hosts a fine-tuned BERT model (bert-base-uncased) that classifies whether a sentence is about a Black entity, i.e. a person, group or organisation. Social scientists often analyse semantic and topic variation by race. Some racial labels, such as "Negro", carry their meaning without any surrounding context. The word "Black", however, has many meanings unrelated to Black people, so the word alone does not identify such references. This model flags whether a sentence contains a "Black" entity. The training dataset is also provided for reproducibility.

📖 Description

  • Model: Fine-tuned bert-base-uncased
  • Training Data: 2,000 manually labeled sentences from historical newspaper articles (1960–1973)
  • Inputs: sentence (string)
  • Outputs: black_story (1 if the sentence contains a Black entity, 0 otherwise)

📊 Performance Metrics

  • Training Accuracy: 93.5%
  • Validation Accuracy: 91.2%
  • Precision: 90.8%
  • Recall: 92.1%
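
For readers less familiar with these metrics, the sketch below shows the standard definitions of accuracy, precision and recall for binary labels, assuming class 1 (a Black entity is present) is the positive class. The function name `binary_metrics` and the label lists are hypothetical and purely illustrative; they are not the model's actual predictions.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision and recall for 0/1 labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("At least one label pair is required")
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Invented labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Precision answers "of the sentences flagged as Black entities, how many were correct?", while recall answers "of the true Black-entity sentences, how many were found?".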

🚀 Usage Instructions

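```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
classifier = pipeline("text-classification", model="mikemcrae25/black_entity_classifier")
# The pipeline returns a list with one dict (label and score) per input sentence
result = classifier("Black activists led a peaceful protest downtown.")
print(result)
```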
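
A text-classification pipeline returns dicts with `label` and `score` keys. When a model's config defines no custom label names, transformers falls back to generic names such as `LABEL_0` and `LABEL_1`. The sketch below assumes that fallback applies here (this card does not confirm it) and maps the label back to the `black_story` value. The helper name `to_black_story` and the sample scores are hypothetical.

```python
def to_black_story(prediction):
    """Map one pipeline prediction dict to the 0/1 black_story value.

    Assumes generic label names of the form "LABEL_<index>".
    """
    label = prediction["label"]
    if not label.startswith("LABEL_"):
        raise ValueError(f"Unexpected label name: {label!r}")
    return int(label.split("_", 1)[1])


# Dicts in the shape a text-classification pipeline returns (scores invented).
sample_results = [
    {"label": "LABEL_1", "score": 0.98},
    {"label": "LABEL_0", "score": 0.91},
]
flags = [to_black_story(r) for r in sample_results]
print(flags)
```

If the model's config does define its own label names, inspect the printed `result` first and adapt the mapping accordingly.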

💾 Training Dataset

📊 Example Data Preview

sentence,black_story
"The Black Panthers organized a march for civil rights.",1
"The mayor discussed the city's budget for next year.",0
"Black students protested against segregation policies.",1
"Black car for sale.",0

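Assuming the training file is a CSV with the two columns shown in the preview, it can be read with Python's standard library. The sketch below inlines the four preview rows so it runs without the real file, then counts how many rows fall in each class.

```python
import csv
import io
from collections import Counter

# The four preview rows, inlined so the example does not need the real file.
preview = '''sentence,black_story
"The Black Panthers organized a march for civil rights.",1
"The mayor discussed the city's budget for next year.",0
"Black students protested against segregation policies.",1
"Black car for sale.",0
'''

rows = list(csv.DictReader(io.StringIO(preview)))
labels = [int(row["black_story"]) for row in rows]  # CSV values arrive as strings
label_counts = Counter(labels)
print(label_counts)
```

The last row shows the case the classifier targets: the word "Black" is present, but the sentence is not about a Black entity.
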
## ⚙️ Reproduction Instructions
```python
from datasets import load_dataset
from transformers import BertForSequenceClassification, Trainer, TrainingArguments

# Load the labeled sentences and a BERT model with a two-class classification head
dataset = load_dataset("mikemcrae/black-article-training-data")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
```
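
The snippet above stops after loading the data and the base model. One way the remaining steps could look is sketched below. It is not the author's training script. It assumes the dataset exposes a single `train` split with the `sentence` and `black_story` columns from the preview. The 80/20 split, the hyperparameters, the output directory, and the names `tokenize_batch` and `fine_tune` are all illustrative assumptions.

```python
def tokenize_batch(batch, tokenizer, max_length=128):
    """Tokenize sentences and copy black_story into the 'labels' field Trainer expects."""
    encoded = tokenizer(
        batch["sentence"], truncation=True, padding="max_length", max_length=max_length
    )
    encoded["labels"] = [int(value) for value in batch["black_story"]]
    return encoded


def fine_tune(output_dir="bert-black-entity"):
    """Fine-tune bert-base-uncased; every setting here is an assumption."""
    # Imported lazily so tokenize_batch can be used without these packages.
    from datasets import load_dataset
    from transformers import (
        BertForSequenceClassification,
        BertTokenizerFast,
        Trainer,
        TrainingArguments,
    )

    dataset = load_dataset("mikemcrae/black-article-training-data")
    # Assumption: only a "train" split exists, so hold out 20% for validation.
    splits = dataset["train"].train_test_split(test_size=0.2, seed=42)

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
    tokenized = splits.map(lambda batch: tokenize_batch(batch, tokenizer), batched=True)

    model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=3,               # assumed, not taken from this card
        per_device_train_batch_size=16,   # assumed
        learning_rate=2e-5,               # assumed
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["test"],
    )
    trainer.train()
    return trainer.evaluate()
```

Calling `fine_tune()` downloads the dataset and base model, so it needs network access and is best run on a GPU. Results will not necessarily match the metrics reported in this card, since the original split and hyperparameters are not documented here.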

📜 License

MIT License: Free to use with attribution.

MIT License © 2025 Mike McRae
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND.

❤️ Citation

@inproceedings{mcrae2025blackbert,
  title={BERT Classifier for Black Article Detection},
  author={Mike McRae},
  year={2025},
  url={https://huggingface.co/mikemcrae25/black_entity_classifier}
}