|
--- |
|
language: en |
|
tags: |
|
- text-classification |
|
- pytorch |
|
- ModernBERT |
|
- bias |
|
- multi-class-classification |
|
- multi-label-classification |
|
datasets: |
|
- synthetic-biased-corpus |
|
license: mit |
|
metrics: |
|
- accuracy |
|
- f1 |
|
- precision |
|
- recall |
|
- matthews_correlation |
|
base_model: |
|
- answerdotai/ModernBERT-large |
|
widget: |
|
- text: Women are bad at math. |
|
library_name: transformers |
|
--- |
|
|
|
![banner](https://huggingface.co/cirimus/modernbert-large-bias-type-classifier/resolve/main/banner.png) |
|
|
|
### Overview |
|
|
|
This model was fine-tuned from [ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on a synthetic dataset of biased statements and questions generated by Mistral 7B as part of the [GUS-Net paper](https://huggingface.co/papers/2410.08388). It classifies text into 11 bias categories, including racial, religious, gender, and age bias, and is intended as a building block for bias detection and mitigation in natural language processing pipelines.
|
|
|
--- |
|
|
|
### Model Details |
|
|
|
- **Base Model**: [ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) |
|
- **Fine-Tuning Dataset**: Synthetic biased corpus |
|
- **Number of Labels**: 11 |
|
- **Problem Type**: Multi-label classification |
|
- **Language**: English |
|
- **License**: [MIT](https://opensource.org/licenses/MIT) |
|
- **Fine-Tuning Framework**: Hugging Face Transformers |
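These details can be sanity-checked against the uploaded configuration. A minimal sketch using `AutoConfig` (no weights are downloaded; whether `problem_type` is stored in the config depends on how the fine-tuning run saved it):

```python
from transformers import AutoConfig

# Fetch only the model configuration, not the weights
config = AutoConfig.from_pretrained("cirimus/modernbert-large-bias-type-classifier")

print(config.num_labels)    # expected: 11
print(config.problem_type)  # expected: "multi_label_classification", if set during fine-tuning
print(config.id2label)      # index -> bias category mapping
```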
|
|
|
--- |
|
|
|
### Example Usage |
|
|
|
Here’s how to use the model with Hugging Face Transformers: |
|
|
|
```python |
|
from transformers import pipeline |
|
|
|
# Load the model |
|
classifier = pipeline( |
|
"text-classification", |
|
model="cirimus/modernbert-large-bias-type-classifier", |
|
return_all_scores=True |
|
) |
|
|
|
text = "Tall people are so clumsy." |
|
predictions = classifier(text) |
|
|
|
# Print predictions |
|
for pred in sorted(predictions[0], key=lambda x: x['score'], reverse=True)[:5]: |
|
print(f"{pred['label']}: {pred['score']:.3f}") |
|
|
|
# Output: |
|
# physical: 1.000 |
|
# socioeconomic: 0.002 |
|
# gender: 0.002 |
|
# racial: 0.001 |
|
# age: 0.001 |
|
``` |
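Because the classification head is multi-label, each category receives an independent sigmoid score rather than a share of a softmax. To turn scores into discrete labels, apply the same `0.5` threshold used in the evaluation below; a short continuation of the example above:

```python
# Keep every category whose score clears the decision threshold
threshold = 0.5
detected = [p["label"] for p in predictions[0] if p["score"] >= threshold]
print(detected)  # e.g. ['physical'] for the sentence above
```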
|
|
|
--- |
|
|
|
### How the Model Was Created |
|
|
|
The model was fine-tuned for bias detection using the following hyperparameters: |
|
|
|
- **Learning Rate**: `3e-5` |
|
- **Batch Size**: 16 |
|
- **Weight Decay**: `0.01` |
|
- **Warmup Steps**: 500 |
|
- **Optimizer**: AdamW |
|
- **Evaluation Metrics**: Precision, Recall, F1 Score (weighted), Accuracy |
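The original training script is not reproduced here; the snippet below is a hedged sketch of an equivalent `Trainer` setup built from the hyperparameters listed above. The epoch count, dataset variables, and `compute_metrics` function are placeholders, not values taken from this card.

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-large",
    num_labels=11,
    problem_type="multi_label_classification",  # sigmoid activations + BCE loss
)
tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-large")

# Hyperparameters as listed above; AdamW is the Trainer default optimizer
training_args = TrainingArguments(
    output_dir="modernbert-large-bias-type-classifier",
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    weight_decay=0.01,
    warmup_steps=500,
    num_train_epochs=3,  # assumption: the epoch count is not stated in the card
)

# trainer = Trainer(
#     model=model,
#     args=training_args,
#     train_dataset=tokenized_train,    # hypothetical tokenized splits of the synthetic corpus
#     eval_dataset=tokenized_eval,
#     compute_metrics=compute_metrics,  # hypothetical precision/recall/F1/accuracy function
# )
# trainer.train()
```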
|
|
|
--- |
|
|
|
### Dataset |
|
|
|
The synthetic dataset consists of biased statements and questions generated by Mistral 7B as part of the GUS-Net paper. It covers 11 bias categories:
|
|
|
1. Racial |
|
2. Religious |
|
3. Gender |
|
4. Age |
|
5. Nationality |
|
6. Sexuality |
|
7. Socioeconomic |
|
8. Educational |
|
9. Disability |
|
10. Political |
|
11. Physical |
|
|
|
--- |
|
|
|
### Evaluation Results |
|
|
|
The model was evaluated on the synthetic dataset’s test split. The overall metrics using a threshold of `0.5` are as follows: |
|
|
|
#### Macro Averages: |
|
|
|
| Metric | Value | |
|
|--------------|--------| |
|
| Accuracy | 0.983 | |
|
| Precision | 0.930 | |
|
| Recall | 0.914 | |
|
| F1 | 0.921 | |
|
| MCC | 0.912 | |
|
|
|
#### Per-Label Results: |
|
|
|
| Label | Accuracy | Precision | Recall | F1 | MCC | Support | Threshold | |
|
|----------------|----------|-----------|--------|-------|-------|---------|-----------| |
|
| Racial | 0.975 | 0.871 | 0.889 | 0.880 | 0.866 | 388 | 0.5 | |
|
| Religious | 0.994 | 0.962 | 0.970 | 0.966 | 0.962 | 335 | 0.5 | |
|
| Gender | 0.976 | 0.930 | 0.925 | 0.927 | 0.913 | 615 | 0.5 | |
|
| Age | 0.990 | 0.964 | 0.931 | 0.947 | 0.941 | 375 | 0.5 | |
|
| Nationality | 0.972 | 0.924 | 0.881 | 0.902 | 0.886 | 554 | 0.5 | |
|
| Sexuality | 0.993 | 0.960 | 0.957 | 0.958 | 0.955 | 301 | 0.5 | |
|
| Socioeconomic | 0.964 | 0.909 | 0.818 | 0.861 | 0.842 | 516 | 0.5 | |
|
| Educational | 0.982 | 0.873 | 0.933 | 0.902 | 0.893 | 330 | 0.5 | |
|
| Disability | 0.986 | 0.923 | 0.887 | 0.905 | 0.897 | 283 | 0.5 | |
|
| Political | 0.988 | 0.958 | 0.938 | 0.948 | 0.941 | 438 | 0.5 | |
|
| Physical | 0.993 | 0.961 | 0.920 | 0.940 | 0.936 | 238 | 0.5 | |
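The per-label figures above correspond to standard binary metrics computed independently for each category after thresholding the scores at `0.5`. A scikit-learn sketch of that computation (illustrative only, not the evaluation script behind the table):

```python
from sklearn.metrics import (
    accuracy_score,
    matthews_corrcoef,
    precision_recall_fscore_support,
)

def per_label_metrics(y_true, y_prob, threshold=0.5):
    """y_true and y_prob are NumPy arrays of shape (n_samples, n_labels)."""
    y_pred = (y_prob >= threshold).astype(int)
    rows = []
    for i in range(y_true.shape[1]):
        precision, recall, f1, _ = precision_recall_fscore_support(
            y_true[:, i], y_pred[:, i], average="binary", zero_division=0
        )
        rows.append({
            "accuracy": accuracy_score(y_true[:, i], y_pred[:, i]),
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "mcc": matthews_corrcoef(y_true[:, i], y_pred[:, i]),
            "support": int(y_true[:, i].sum()),
        })
    return rows
```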
|
|
|
--- |
|
|
|
### Intended Use |
|
|
|
The model is designed to detect and classify bias in text across 11 categories. It can be used in applications such as: |
|
|
|
- Content moderation |
|
- Bias analysis in research |
|
- Ethical AI development |
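As an illustration of the content-moderation use case, here is a sketch of a simple flagging helper built on the pipeline from the usage example above (the helper name and the `0.5` cut-off are application-level choices, not part of the model):

```python
def flag_biased(texts, classifier, threshold=0.5):
    """Return (text, triggered_labels) pairs for inputs exceeding the threshold."""
    flagged = []
    for text, scores in zip(texts, classifier(texts)):
        labels = [s["label"] for s in scores if s["score"] >= threshold]
        if labels:
            flagged.append((text, labels))
    return flagged

# `classifier` is the pipeline created in the usage example above
print(flag_biased(["Tall people are so clumsy.", "The sky is blue."], classifier))
```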
|
|
|
--- |
|
|
|
### Limitations and Biases |
|
|
|
- **Synthetic Nature**: The dataset consists of synthetic text, which may not fully represent real-world biases. |
|
- **Category Overlap**: Certain biases may overlap, leading to challenges in precise classification. |
|
- **Domain-Specific Generalization**: The model may not generalize well to domains outside the synthetic dataset’s scope. |
|
|
|
--- |
|
|
|
### Environmental Impact |
|
|
|
- **Hardware Used**: NVIDIA RTX 4090
|
- **Training Time**: ~2 hours |
|
- **Carbon Emissions**: ~0.08 kg CO2 (calculated via [ML CO2 Impact Calculator](https://mlco2.github.io/impact)). |
|
|
|
--- |
|
|
|
### Citation |
|
|
|
If you use this model, please cite it as follows: |
|
|
|
```bibtex |
|
@misc{JunquedeFortuny2025c,
  title        = {Bias Detection with ModernBERT-Large},
  author       = {Enric Junqué de Fortuny},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/cirimus/modernbert-large-bias-type-classifier}},
}
|
``` |
|
|
|
|