Model Card for uvegesistvan/wildmann_german_proposal_2b
Model Overview
This model is a multi-class emotion classifier trained on German text that was machine-translated into English. It assigns one of nine emotion labels to an input text. The training data combines synthetic and original German sentences, all translated into English, so the model is exposed to the linguistic and cultural variation that machine translation introduces.
Emotion Classes
The model classifies the following emotional states:
- Anger (0)
- Fear (1)
- Disgust (2)
- Sadness (3)
- Joy (4)
- Enthusiasm (5)
- Hope (6)
- Pride (7)
- No emotion (8)
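The label list above can be written down as a lookup table for decoding predictions. This is a minimal sketch: the names `ID2LABEL`, `LABEL2ID`, and `decode_label` are illustrative, and handling of `LABEL_<n>` strings is an assumption that only matters if the checkpoint's config does not carry readable label names (the `transformers` default naming in that case).

```python
# Class ids and names exactly as listed in this model card.
ID2LABEL = {
    0: "Anger",
    1: "Fear",
    2: "Disgust",
    3: "Sadness",
    4: "Joy",
    5: "Enthusiasm",
    6: "Hope",
    7: "Pride",
    8: "No emotion",
}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def decode_label(raw):
    """Map a class index, a 'LABEL_<n>' string, or an emotion name to the name."""
    if isinstance(raw, str):
        if raw in LABEL2ID:
            return raw  # already a readable emotion name
        # Assumed fallback format: 'LABEL_3' or a bare digit string such as '3'.
        raw = int(raw.rsplit("_", 1)[-1])
    return ID2LABEL[raw]
```

For example, `decode_label(3)` and `decode_label("LABEL_3")` both resolve to "Sadness".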
Dataset and Preprocessing
The dataset consists of German text machine-translated into English and annotated for emotional content. It includes both synthetic and original sentences to increase diversity. Preprocessing involved:
- Undersampling of overrepresented classes, such as "No emotion" and "Anger," to ensure balanced training across all labels.
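The card does not describe how the undersampling was carried out. As one possible reading, the sketch below randomly downsamples each class to a common cap. The function name `undersample`, the `cap` and `seed` parameters, and the `(text, label)` input format are all assumptions made for illustration, not the author's actual procedure.

```python
import random
from collections import defaultdict


def undersample(examples, cap=None, seed=0):
    """Randomly downsample every class to at most `cap` examples.

    `examples` is a list of (text, label) pairs. If `cap` is None, the size
    of the smallest class is used, which gives a fully balanced result.
    """
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    if not by_label:
        return []
    if cap is None:
        cap = min(len(items) for items in by_label.values())

    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    balanced = []
    for label in sorted(by_label):
        items = by_label[label]
        # Classes at or below the cap are kept whole; larger ones are sampled.
        balanced.extend(rng.sample(items, min(cap, len(items))))
    rng.shuffle(balanced)
    return balanced
```

Passing an explicit `cap` leaves smaller classes untouched and trims only the overrepresented ones, which matches the description of reducing classes such as "No emotion" and "Anger".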
Evaluation Metrics
The model's performance was evaluated using precision, recall, F1-score, and accuracy metrics. Detailed results are as follows:
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
Anger (0) | 0.54 | 0.58 | 0.56 | 777 |
Fear (1) | 0.79 | 0.79 | 0.79 | 776 |
Disgust (2) | 0.96 | 0.93 | 0.94 | 776 |
Sadness (3) | 0.86 | 0.83 | 0.84 | 775 |
Joy (4) | 0.82 | 0.82 | 0.82 | 777 |
Enthusiasm (5) | 0.64 | 0.61 | 0.63 | 776 |
Hope (6) | 0.51 | 0.57 | 0.54 | 777 |
Pride (7) | 0.71 | 0.82 | 0.76 | 776 |
No emotion (8) | 0.70 | 0.62 | 0.66 | 1553 |
Overall Metrics
- Accuracy: 0.72
- Macro Average: Precision = 0.73, Recall = 0.73, F1-Score = 0.73
- Weighted Average: Precision = 0.72, Recall = 0.72, F1-Score = 0.72
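The aggregate figures can be checked against the per-class table. The sketch below copies the table values into a dictionary (the names `REPORT`, `macro_average`, and `weighted_average` are illustrative) and recomputes both averages. Because the inputs are already rounded to two decimals, the results agree with the reported numbers only to that precision.

```python
# (precision, recall, f1, support) per class, copied from the table above.
REPORT = {
    "Anger": (0.54, 0.58, 0.56, 777),
    "Fear": (0.79, 0.79, 0.79, 776),
    "Disgust": (0.96, 0.93, 0.94, 776),
    "Sadness": (0.86, 0.83, 0.84, 775),
    "Joy": (0.82, 0.82, 0.82, 777),
    "Enthusiasm": (0.64, 0.61, 0.63, 776),
    "Hope": (0.51, 0.57, 0.54, 777),
    "Pride": (0.71, 0.82, 0.76, 776),
    "No emotion": (0.70, 0.62, 0.66, 1553),
}
PRECISION, RECALL, F1, SUPPORT = 0, 1, 2, 3


def macro_average(report, column):
    """Unweighted mean of one metric over all classes."""
    values = [row[column] for row in report.values()]
    return sum(values) / len(values)


def weighted_average(report, column):
    """Mean of one metric over all classes, weighted by class support."""
    total = sum(row[SUPPORT] for row in report.values())
    return sum(row[column] * row[SUPPORT] for row in report.values()) / total


if __name__ == "__main__":
    for name, col in (("precision", PRECISION), ("recall", RECALL), ("f1", F1)):
        print(name, round(macro_average(REPORT, col), 2), round(weighted_average(REPORT, col), 2))
```

Support-weighted recall is the same quantity as overall accuracy (correct predictions divided by all examples), so the weighted recall computed here is consistent with the reported accuracy of 0.72.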
Performance Insights
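The model performs best on "Disgust" (F1 = 0.94), followed by "Sadness" (0.84), "Joy" (0.82), and "Fear" (0.79). "Hope" (0.54), "Anger" (0.56), and "Enthusiasm" (0.63) are the weakest classes, possibly because these emotions are expressed more subtly or because translation noise blurs the distinctions between them. These results illustrate the trade-offs involved when training on machine-translated text.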
Model Usage
Applications
- Emotion analysis of German texts, using machine translation into English as an intermediate step.
- Research on cross-linguistic emotion classification in multilingual datasets.
- Sentiment analysis for social media or user feedback originally in German.
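The translate-then-classify workflow described above can be sketched as follows. The card does not name the machine translation system, so the translation step is left as a caller-supplied function. The helper names `classify_german` and `load_classifier` are hypothetical, and loading the checkpoint through the `transformers` text-classification pipeline is an assumption, not something the card confirms.

```python
def classify_german(texts, translate, classify):
    """Translate German texts to English, then classify each translation.

    `translate`: callable mapping a list of German strings to English strings.
    `classify`:  callable mapping a list of English strings to label strings.
    Both are supplied by the caller, so any MT system can be plugged in.
    """
    texts = list(texts)
    english = translate(texts)
    labels = classify(english)
    return [
        {"german": de, "english": en, "label": label}
        for de, en, label in zip(texts, english, labels)
    ]


def load_classifier(model_id="uvegesistvan/wildmann_german_proposal_2b"):
    """Assumed loader: wraps a `transformers` text-classification pipeline.

    Requires the optional `transformers` dependency and downloads the model.
    """
    from transformers import pipeline  # imported lazily on purpose

    clf = pipeline("text-classification", model=model_id)

    def classify(batch):
        # Each pipeline output is a dict with 'label' and 'score' keys.
        return [out["label"] for out in clf(batch)]

    return classify
```

A call would then look like `classify_german(german_texts, translate=my_translator, classify=load_classifier())`, where `my_translator` stands for whatever German-to-English system is available. Keeping the English translation in the output makes it easier to check whether a misclassification stems from the translation or from the classifier.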
Limitations
- The model's performance is influenced by the quality of the machine-translated text. Translation errors or omissions could lead to misclassifications.
- Subtle emotional expressions may not always translate effectively, potentially introducing inaccuracies in classification.
Ethical Considerations
The use of machine-translated datasets may lead to biases or inaccuracies due to linguistic and cultural nuances being lost during translation. Users should carefully evaluate the model before applying it to sensitive domains such as mental health, social research, or customer sentiment analysis.
Citation
For further information, visit: uvegesistvan/wildmann_german_proposal_2b