Danish BERT for hate speech classification

The BERT HateSpeech model classifies offensive Danish text into four categories:

  • Særlig opmærksomhed (special attention, e.g. threat)
  • Personangreb (personal attack)
  • Sprogbrug (offensive language)
  • Spam & indhold (spam)

This model is intended to be used after the BERT HateSpeech detection model.

It is based on the pretrained Danish BERT model by BotXO and has been fine-tuned on social media data.

See the DaNLP documentation for more details.

Here is how to load the model and its tokenizer:

from transformers import BertTokenizer, BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("alexandrainst/da-hatespeech-classification-base")
tokenizer = BertTokenizer.from_pretrained("alexandrainst/da-hatespeech-classification-base")
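
Loading the model does not yet produce a prediction. The following is a minimal sketch of one way to classify a single text, not the authors' documented procedure. It assumes that PyTorch is installed, that the label names can be read from the checkpoint's `id2label` config, and that 512 tokens is a suitable truncation limit. The helper names `softmax`, `pick_label` and `classify`, and the example sentence, are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(text, model_name="alexandrainst/da-hatespeech-classification-base"):
    """Classify one Danish text; downloads the model on first use."""
    # Heavy dependencies are imported here so the helpers above stay lightweight.
    import torch
    from transformers import BertTokenizer, BertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained(model_name)
    model = BertForSequenceClassification.from_pretrained(model_name)
    model.eval()  # disable dropout for inference

    # max_length=512 is an assumption based on BERT's usual input limit.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # The label order is taken from the model's own config rather than
    # assumed; inspect model.config.id2label to see the actual names.
    return pick_label(logits, model.config.id2label)


if __name__ == "__main__":
    # Hypothetical input text, for illustration only.
    print(classify("Dette er en eksempeltekst."))
```

Since the model is meant to run after the detection model, a call like `classify(...)` would normally be applied only to texts already flagged as offensive.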

Training data

The data used for training has not been made publicly available. It consists of social media data manually annotated in collaboration with Danmarks Radio.
