cirimus/modernbert-large-bias-type-classifier

Overview

This model was fine-tuned from ModernBERT-large on a synthetic dataset of biased statements and questions generated by Mistral 7B as part of the GUS-Net paper. It classifies the type of bias expressed in a text into 11 categories: racial, religious, gender, age, nationality, sexuality, socioeconomic, educational, disability, political, and physical. The task is multi-label, so a single text can be assigned more than one category.

Model Details

Base Model: ModernBERT-large
Fine-Tuning Dataset: Synthetic biased corpus
Number of Labels: 11
Problem Type: Multi-label classification
Language: English
License: MIT
Fine-Tuning Framework: Hugging Face Transformers
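
Multi-label classification means each of the 11 labels is scored independently, so the scores for one text need not sum to 1. A minimal sketch of that scoring step, assuming the classification head produces one logit per label and a sigmoid is applied to each (the logit values below are made up for illustration):

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def multilabel_scores(logits):
    """Score every label independently; no normalization across labels."""
    return [sigmoid(z) for z in logits]


# Hypothetical logits for three labels (not real model output).
logits = [6.0, -5.0, 0.3]
scores = multilabel_scores(logits)

for z, s in zip(logits, scores):
    print(f"logit={z:+.1f} -> score={s:.3f}")
```

Because no softmax ties the labels together, several scores can exceed a cutoff at once, or none at all.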

Example Usage

Here’s how to use the model with Hugging Face Transformers:

from transformers import pipeline

# Load the model
classifier = pipeline(
    "text-classification",
    model="cirimus/modernbert-large-bias-type-classifier",
    return_all_scores=True
)

text = "Tall people are so clumsy."
predictions = classifier(text)

# Print predictions
for pred in sorted(predictions[0], key=lambda x: x['score'], reverse=True)[:5]:
    print(f"{pred['label']}: {pred['score']:.3f}")

# Output:
# physical: 1.000
# socioeconomic: 0.002
# gender: 0.002
# racial: 0.001
# age: 0.001
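
To turn the scores into a set of predicted labels, one option is to keep every label whose score reaches a cutoff. The sketch below assumes the list-of-dicts format printed above and uses 0.5, the threshold reported in the evaluation section. `labels_above_threshold` is a hypothetical helper, not part of Transformers, and the choice of `>=` rather than `>` is an assumption.

```python
def labels_above_threshold(scores, threshold=0.5):
    """Return the labels whose score is at least `threshold`.

    `scores` is a list of {"label": str, "score": float} dicts, i.e. the
    per-text element of the pipeline output shown above.
    """
    return [item["label"] for item in scores if item["score"] >= threshold]


# Scores copied from the example output above (top five labels only).
example_scores = [
    {"label": "physical", "score": 1.000},
    {"label": "socioeconomic", "score": 0.002},
    {"label": "gender", "score": 0.002},
    {"label": "racial", "score": 0.001},
    {"label": "age", "score": 0.001},
]

print(labels_above_threshold(example_scores))  # ['physical']
```

An empty result means no category reached the cutoff for that text.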

How the Model Was Created

The model was fine-tuned for bias detection using the following hyperparameters:

Learning Rate: 3e-5
Batch Size: 16
Weight Decay: 0.01
Warmup Steps: 500
Optimizer: AdamW
Evaluation Metrics: Precision, Recall, F1 Score (weighted), Accuracy
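
As a rough sketch, these values could be collected as keyword arguments for `transformers.TrainingArguments`. The key names are existing `TrainingArguments` parameters, but mapping "Batch Size" to `per_device_train_batch_size` and "AdamW" to `optim="adamw_torch"` are assumptions. The number of epochs and the learning-rate schedule are not stated in this card, so they are left out.

```python
# Hyperparameters listed above, keyed by TrainingArguments parameter names.
training_kwargs = {
    "learning_rate": 3e-5,
    "per_device_train_batch_size": 16,  # assumption: batch size is per device
    "weight_decay": 0.01,
    "warmup_steps": 500,
    "optim": "adamw_torch",  # assumption: PyTorch AdamW implementation
}

for name, value in training_kwargs.items():
    print(f"{name} = {value}")

# With transformers installed, these could be passed along as:
#   from transformers import TrainingArguments
#   args = TrainingArguments(output_dir="bias-classifier", **training_kwargs)
# where "bias-classifier" is a placeholder output directory.
```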

Dataset

The synthetic dataset consists of biased statements and questions generated by Mistral 7B as part of the GUS-Net paper. It covers 11 bias categories:

Racial
Religious
Gender
Age
Nationality
Sexuality
Socioeconomic
Educational
Disability
Political
Physical

Evaluation Results

The model was evaluated on the synthetic dataset’s test split. The overall metrics using a threshold of 0.5 are as follows:

Macro Averages:

Metric	Value
Accuracy	0.983
Precision	0.930
Recall	0.914
F1	0.921
MCC	0.912
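
For reference, each of these metrics can be computed per label from binary confusion counts (label present or absent) and then averaged over labels. The sketch below uses the standard definitions with made-up counts; it is not the author's evaluation script.

```python
import math


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall, F1 and MCC for one label."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for a single label (not taken from the test split).
metrics = binary_metrics(tp=90, fp=10, fn=10, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy is high for every label partly because most examples are negatives for any given category; MCC and F1 are less affected by that imbalance.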

Per-Label Results:

Label	Accuracy	Precision	Recall	F1	MCC	Support	Threshold
Racial	0.975	0.871	0.889	0.880	0.866	388	0.5
Religious	0.994	0.962	0.970	0.966	0.962	335	0.5
Gender	0.976	0.930	0.925	0.927	0.913	615	0.5
Age	0.990	0.964	0.931	0.947	0.941	375	0.5
Nationality	0.972	0.924	0.881	0.902	0.886	554	0.5
Sexuality	0.993	0.960	0.957	0.958	0.955	301	0.5
Socioeconomic	0.964	0.909	0.818	0.861	0.842	516	0.5
Educational	0.982	0.873	0.933	0.902	0.893	330	0.5
Disability	0.986	0.923	0.887	0.905	0.897	283	0.5
Political	0.988	0.958	0.938	0.948	0.941	438	0.5
Physical	0.993	0.961	0.920	0.940	0.936	238	0.5
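
As a sanity check, the macro averages reported above can be recovered as the unweighted mean of these per-label rows. A short sketch, with the values copied from the table:

```python
# (accuracy, precision, recall, f1, mcc) per label, copied from the table.
per_label = {
    "Racial":        (0.975, 0.871, 0.889, 0.880, 0.866),
    "Religious":     (0.994, 0.962, 0.970, 0.966, 0.962),
    "Gender":        (0.976, 0.930, 0.925, 0.927, 0.913),
    "Age":           (0.990, 0.964, 0.931, 0.947, 0.941),
    "Nationality":   (0.972, 0.924, 0.881, 0.902, 0.886),
    "Sexuality":     (0.993, 0.960, 0.957, 0.958, 0.955),
    "Socioeconomic": (0.964, 0.909, 0.818, 0.861, 0.842),
    "Educational":   (0.982, 0.873, 0.933, 0.902, 0.893),
    "Disability":    (0.986, 0.923, 0.887, 0.905, 0.897),
    "Political":     (0.988, 0.958, 0.938, 0.948, 0.941),
    "Physical":      (0.993, 0.961, 0.920, 0.940, 0.936),
}
metric_names = ("accuracy", "precision", "recall", "f1", "mcc")

# Macro average: every label counts equally, regardless of its support.
macro = {
    name: sum(row[i] for row in per_label.values()) / len(per_label)
    for i, name in enumerate(metric_names)
}

for name, value in macro.items():
    print(f"{name}: {value:.3f}")
```

Because macro averaging ignores support, the weaker categories (Socioeconomic and Racial by F1) pull the average down as much as the stronger ones lift it.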

Intended Use

The model is designed to detect and classify bias in text across 11 categories. It can be used in applications such as:

Content moderation
Bias analysis in research
Ethical AI development

Limitations and Biases

Synthetic Nature: The dataset consists of synthetic text, which may not fully represent real-world biases.
Category Overlap: Certain biases may overlap, leading to challenges in precise classification.
Domain-Specific Generalization: The model may not generalize well to domains outside the synthetic dataset’s scope.

Environmental Impact

Hardware Used: NVIDIA RTX 4090
Training Time: ~2 hours
Carbon Emissions: ~0.08 kg CO2 (calculated via ML CO2 Impact Calculator).

Citation

If you use this model, please cite it as follows:

@misc{JunquedeFortuny2025c,
  title = {Bias Detection with ModernBERT-Large},
  author = {Enric Junqué de Fortuny},
  year = {2025},
  howpublished = {\url{https://huggingface.co/cirimus/modernbert-large-bias-type-classifier}},
}
