Stereotype detection at aequa-tech
Model Description
- Developed by: aequa-tech
- Funded by: NGI-Search
- Language(s) (NLP): Italian
- License: apache-2.0
- Finetuned from model: AlBERTo
This model is a version of the Italian AlBERTo model fine-tuned for stereotype detection.
Training Details
Training Data
- HaSpeeDe 2020
- Sarcastic Hate Speech dataset
- Racial stereotypes corpus (available upon request from the authors of "A Multilingual Dataset of Racial Stereotypes in Social Media Conversational Threads")
- Debunker-Assistant corpus
Training Hyperparameters
- learning_rate: 2e-5
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam
Evaluation
Testing Data
The model was evaluated on the HaSpeeDe test sets (tweets and news headlines), with the following results:
Metrics and Results
Tweets:
- macro F1: 0.75
- accuracy: 0.75
- precision of positive class: 0.66
- recall of positive class: 0.94
- F1 of positive class: 0.78
News Headlines:
- macro F1: 0.72
- accuracy: 0.77
- precision of positive class: 0.73
- recall of positive class: 0.52
- F1 of positive class: 0.61
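The positive-class F1 values above can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The following is a minimal sketch of that check in plain Python; the helper name `f1_score` and the dictionary layout are illustrative choices, not part of the model's evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Positive-class figures copied from the results listed above.
reported = {
    "tweets": {"precision": 0.66, "recall": 0.94, "f1": 0.78},
    "news_headlines": {"precision": 0.73, "recall": 0.52, "f1": 0.61},
}

# Recompute F1 from precision and recall, rounded to two decimals
# to match the precision of the reported numbers.
recomputed = {
    name: round(f1_score(m["precision"], m["recall"]), 2)
    for name, m in reported.items()
}

for name, value in recomputed.items():
    print(f"{name}: recomputed F1 = {value:.2f}, reported F1 = {reported[name]['f1']:.2f}")
```

Both recomputed values agree with the reported ones after rounding, so the listed precision, recall, and F1 are mutually consistent.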
Framework versions
- Transformers 4.30.2
- PyTorch 2.1.2
- Datasets 2.19.0
- Accelerate 0.30.0
How to use this model:
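from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned classifier and the tokenizer of the AlBERTo base model
model = AutoModelForSequenceClassification.from_pretrained("aequa-tech/stereotype-it", num_labels=2)
tokenizer = AutoTokenizer.from_pretrained("m-polignano-uniba/bert_uncased_L-12_H-768_A-12_italian_alb3rt0")
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
classifier("text")  # replace "text" with the Italian sentence to classify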
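A text-classification pipeline returns a list of dictionaries, each with a `label` and a `score` key. The sketch below shows one way to turn that output into boolean flags. It rests on assumptions not stated in this card: that the positive (stereotype) class is exposed as `LABEL_1` (check the model's `id2label` configuration), and that the score is a softmax probability over the two classes. The helper names, the `threshold` parameter, and the sample predictions are hypothetical.

```python
from typing import Dict, List

# Assumption: the positive (stereotype) class is exposed as "LABEL_1".
# Verify this against the model's id2label mapping before relying on it.
POSITIVE_LABEL = "LABEL_1"


def positive_probability(prediction: Dict[str, object]) -> float:
    """Probability of the positive class for one result of a binary classifier."""
    score = float(prediction["score"])
    # The pipeline reports only the top label; for two classes the other
    # class has probability 1 - score (assuming softmax scores).
    return score if prediction["label"] == POSITIVE_LABEL else 1.0 - score


def flag_stereotypes(predictions: List[Dict[str, object]], threshold: float = 0.5) -> List[bool]:
    """Mark each prediction whose positive-class probability reaches the threshold."""
    return [positive_probability(p) >= threshold for p in predictions]


# Hand-written sample in the shape the pipeline returns (not real model output).
sample = [
    {"label": "LABEL_1", "score": 0.91},
    {"label": "LABEL_0", "score": 0.80},
]
flags = flag_stereotypes(sample)
print(flags)
```

With a real pipeline, the same helper would be called as `flag_stereotypes(classifier(list_of_texts))`.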