Fine-tuned distilroberta-base for detecting news on the labor movement

Model Description

This model is a fine-tuned distilroberta-base that classifies whether a news article is about the labor movement.

How to Use

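from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
classifier = pipeline("text-classification", model="dell-research-harvard/topic-labor_movement")
# The pipeline returns a list with one dict per input, holding the predicted label and its score
print(classifier("Strikes in Pittsburgh"))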
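The single-string call above extends to a list of articles. The following is a minimal sketch, not the authors' own inference code: the helper names `load_classifier` and `classify_articles` are hypothetical, and `truncation=True` is an assumption added so that articles longer than the model's maximum input length are cut rather than raising an error. The label strings returned depend on the model's configuration and are not documented here, so the sketch passes them through unchanged.

```python
from typing import Callable, Dict, Iterable, List

MODEL_ID = "dell-research-harvard/topic-labor_movement"


def load_classifier():
    """Build the text-classification pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily so the helper below has no hard dependency

    return pipeline("text-classification", model=MODEL_ID)


def classify_articles(
    classifier: Callable, texts: Iterable[str], batch_size: int = 8
) -> List[Dict]:
    """Classify several articles and pair each text with its predicted label and score.

    `classifier` is any callable with the pipeline's interface: it takes a list of
    strings and returns one {"label": ..., "score": ...} dict per input.
    """
    texts = list(texts)
    # truncation=True is an assumption: it cuts over-long articles to the model's limit.
    outputs = classifier(texts, truncation=True, batch_size=batch_size)
    return [
        {"text": text, "label": out["label"], "score": out["score"]}
        for text, out in zip(texts, outputs)
    ]
```

A call such as `classify_articles(load_classifier(), ["Strikes in Pittsburgh", "Wheat prices fall"])` would then return one record per article, in input order.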

Training data

The model was trained on a hand-labelled sample of data from the NEWSWIRE dataset.

| Split | Size |
|-------|------|
| Train | 253  |
| Dev   | 54   |
| Test  | 54   |

Test set results

| Metric    | Result |
|-----------|--------|
| F1        | 0.9412 |
| Accuracy  | 0.9630 |
| Precision | 0.9412 |
| Recall    | 0.9412 |
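To make the relationship between these metrics concrete, the sketch below computes them from confusion-matrix counts. The counts are an assumption, not something the model card reports: they were chosen because they sum to the 54 test examples and reproduce the rounded figures above. The function name `classification_metrics` is likewise illustrative.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"f1": f1, "accuracy": accuracy, "precision": precision, "recall": recall}


# ASSUMPTION: hypothetical counts, not reported by the authors.
# 16 + 1 + 1 + 36 = 54, matching the size of the test split.
metrics = classification_metrics(tp=16, fp=1, fn=1, tn=36)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because precision and recall are equal under these counts, their harmonic mean (F1) takes the same value, which is consistent with the three identical figures reported.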

Citation Information

You can cite the NEWSWIRE dataset using:

@misc{silcock2024newswirelargescalestructureddatabase,
      title={Newswire: A Large-Scale Structured Database of a Century of Historical News}, 
      author={Emily Silcock and Abhishek Arora and Luca D'Amico-Wong and Melissa Dell},
      year={2024},
      eprint={2406.09490},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2406.09490}, 
}

Applications

We applied this model to a century of historical news articles. You can see all the classifications in the NEWSWIRE dataset.
