GuiltRoBERTa-en: A Two-Stage Classifier for Guilt-Assignment Rhetoric in English Political Texts
GuiltRoBERTa-en is a two-stage AI pipeline for detecting guilt-assignment rhetoric in English political discourse. It combines:
- Stage 1 – Emotion Pre-Filtering: emotion labels from the Babel Emotions 6 Tool
- Stage 2 – Guilt Classification: a fine-tuned binary XLM-RoBERTa model trained on manually annotated English texts (guilt vs. no_guilt)
The approach is grounded in political communication theory, which suggests that guilt attribution often emerges in anger-laden contexts. Thus, only texts labeled as "Anger" in Stage 1 are passed to the guilt classifier.
🧩 Model Architecture
Stage 1: Emotion Pre-Filtering (Babel Emotions Tool)
- Tool: Emotions 6 Babel Machine
- Task: 6-class emotion classification (Anger, Fear, Disgust, Sadness, Joy, None of them)
- Input: CSV file with one text per row
- Output: CSV file with predicted labels and probabilities
- Usage: retain only rows with emotion_predicted == "Anger" for Stage 2
The Babel Emotions Tool is not an API but a web-based interface. Upload a CSV file, download the labeled results, and use them as input to the guilt classifier.
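A minimal sketch of this round trip, assuming the raw corpus sits in a CSV with a text column and that the downloaded Babel output keeps that column alongside an emotion_predicted column (the file and column names here are illustrative; adjust them to match your export):

import pandas as pd

# Prepare the upload: one text per row, as the Babel interface expects.
# (Assumes the raw data lives in raw_corpus.csv with a "text" column.)
raw = pd.read_csv("raw_corpus.csv")
raw[["text"]].to_csv("babel_input.csv", index=False)

# After uploading babel_input.csv and downloading the labeled results,
# keep only the Anger rows for Stage 2.
# (Assumes the download is babel_output.csv with an "emotion_predicted" column.)
labeled = pd.read_csv("babel_output.csv")
anger_only = labeled[labeled["emotion_predicted"] == "Anger"].copy()
anger_only.to_csv("anger_only.csv", index=False)
print(f"Kept {len(anger_only)} of {len(labeled)} rows for Stage 2")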
Stage 2: Guilt Classification
- Base model: xlm-roberta-base
- Task: Binary classification (guilt, no_guilt)
- Training data: Sentence-level annotated English corpus
- Optimization: Class-weighted loss function to handle label imbalance
- Recommended threshold: τ = 0.15
Motivation
Guilt assignment — attributing moral responsibility or blame — is a key rhetorical strategy in political communication. Since guilt often appears alongside anger, direct one-stage classification risks conflating emotional tones.
This two-stage pipeline improves precision by:
- Filtering anger-related contexts first
- Then applying a dedicated guilt detector only where relevant
Evaluation
The model was evaluated on a held-out validation set (20% stratified split) using the following configuration:
| Stage 1 Filter | Threshold (τ) | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|
| Anger-only | 0.15 | optimized | optimized | optimized | optimized |
- Best configuration: Anger-only, τ = 0.15
- Metrics: Accuracy, Precision, Recall, F1-score, ROC-AUC, PR-AUC
- The two-stage model shows improved performance compared to single-stage baselines
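The reported metrics can be computed with scikit-learn once gold labels and predicted guilt probabilities are available for the anger-filtered validation sentences. The arrays below are illustrative placeholders, not the actual evaluation data:

import numpy as np
from sklearn.metrics import (accuracy_score, precision_recall_fscore_support,
                             roc_auc_score, average_precision_score)

# Toy values; replace with the held-out gold labels (1 = guilt, 0 = no_guilt)
# and the model's guilt probabilities for the same sentences.
y_true = np.array([1, 0, 0, 1, 0, 1])
y_score = np.array([0.42, 0.05, 0.20, 0.81, 0.10, 0.17])

THRESHOLD = 0.15  # recommended τ
y_pred = (y_score > THRESHOLD).astype(int)

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary", pos_label=1
)
print(f"Accuracy : {accuracy_score(y_true, y_pred):.3f}")
print(f"Precision: {precision:.3f}")
print(f"Recall   : {recall:.3f}")
print(f"F1       : {f1:.3f}")
print(f"ROC-AUC  : {roc_auc_score(y_true, y_score):.3f}")
print(f"PR-AUC   : {average_precision_score(y_true, y_score):.3f}")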
Usage Example
Step 1: Get Emotion Predictions from Babel
- Visit https://emotionsbabel.poltextlab.com/
- Upload your CSV file (one text per row)
- Download the predictions (includes an emotion_predicted column)
Step 2: Apply Guilt Classifier
import pandas as pd
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline
# Load the Babel emotion predictions downloaded in Step 1 (CSV file)
df = pd.read_csv("your_data_with_emotion_predictions.csv")
# Filter for 'Anger' only
anger_df = df[df["emotion_predicted"] == "Anger"].copy()
# Load the guilt classifier
repo_id = "your-org/guiltroberta-en" # Update with actual path
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=True)
# Apply guilt predictions with threshold
THRESHOLD = 0.15
anger_df["guilt_score"] = anger_df["text"].apply(
lambda t: pipe(t)[0][1]["score"] # score for 'guilt' label
)
anger_df["guilt_predicted"] = anger_df["guilt_score"] > THRESHOLD
# Save results
anger_df.to_excel("anger_with_guilt_predictions.xlsx", index=False)
# Statistics
print(f"Total anger sentences: {len(anger_df)}")
print(f"Predicted guilt: {anger_df['guilt_predicted'].sum()}")
print(f"Guilt ratio: {anger_df['guilt_predicted'].mean():.2%}")
Alternative: Direct Inference
import torch
from transformers import XLMRobertaTokenizer, XLMRobertaForSequenceClassification
# Load model
model_path = "your-org/guiltroberta-en"
tokenizer = XLMRobertaTokenizer.from_pretrained(model_path)
model = XLMRobertaForSequenceClassification.from_pretrained(model_path)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()
# Example: anger-labeled sentence
text = "I'm furious at myself for letting this happen again."
# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512, padding=True)
inputs = {k: v.to(device) for k, v in inputs.items()}
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
prob_guilt = torch.softmax(logits, dim=-1)[0][1].item()
# Apply threshold
THRESHOLD = 0.15
prediction = "guilt" if prob_guilt > THRESHOLD else "no_guilt"
print(f"Guilt probability: {prob_guilt:.4f}")
print(f"Prediction: {prediction}")
Training Configuration
- Epochs: 4
- Learning Rate: 2e-5
- Batch Size: 8
- Max Sequence Length: 512 tokens
- Optimizer: AdamW
- Scheduler: Linear warmup
- Train/Validation Split: 80/20 (stratified)
- Class Weighting: Applied to handle label imbalance
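For reference, a minimal fine-tuning sketch that mirrors this configuration. The dataset files, column names, warmup fraction, and class-weight formula are illustrative assumptions, not the exact training script:

import numpy as np
import torch
from torch.nn import CrossEntropyLoss
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

# Hypothetical sentence-level corpus with "text" and integer "label" columns
# (0 = no_guilt, 1 = guilt); replace with the actual annotated data.
dataset = load_dataset("csv", data_files={"train": "train.csv", "validation": "val.csv"})

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base",
    num_labels=2,
    id2label={0: "no_guilt", 1: "guilt"},
    label2id={"no_guilt": 0, "guilt": 1},
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

dataset = dataset.map(tokenize, batched=True)

# Class weights inversely proportional to label frequency (handles imbalance).
labels = np.array(dataset["train"]["label"])
counts = np.bincount(labels, minlength=2)
class_weights = torch.tensor(len(labels) / (2 * counts), dtype=torch.float)

class WeightedTrainer(Trainer):
    # Override the loss to apply the class weights to cross-entropy.
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        loss_fct = CrossEntropyLoss(weight=class_weights.to(outputs.logits.device))
        loss = loss_fct(outputs.logits.view(-1, 2), labels.view(-1))
        return (loss, outputs) if return_outputs else loss

args = TrainingArguments(
    output_dir="guiltroberta-en",
    num_train_epochs=4,
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    warmup_ratio=0.1,  # illustrative warmup fraction; linear decay is the default scheduler
    optim="adamw_torch",
)

trainer = WeightedTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()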