Model Card for LLaVA-1.6-Mistral-7B-Offensive-Meme-Singapore
This model is described in the paper Detecting Offensive Memes with Social Biases in Singapore Context Using Multimodal Large Language Models. It classifies memes as offensive or not offensive, specifically within the Singaporean context.
Model Details
This model is a fine-tuned Vision-Language Model (VLM) designed to detect offensive memes in the Singaporean context. It leverages the strengths of VLMs to handle the nuanced, culturally specific nature of meme interpretation, addressing the limitations of traditional content moderation systems. The model was fine-tuned on a dataset of 112K memes labeled by GPT-4V. The fine-tuning process involved a pipeline incorporating OCR, translation, and a 7-billion-parameter VLM (llava-hf/llava-v1.6-mistral-7b-hf). The resulting model achieves a self-reported AUROC of 0.735 and accuracy of 0.726 on a held-out test set.
- Developed by: Cao Yuxuan, Wu Jiayang, Alistair Cheong Liang Chuen, Bryan Shan Guanrong, Theodore Lee Chong Jen, and Sherman Chann Zhi Shen
- Model type: Fine-tuned Vision-Language Model (VLM)
- Language(s) (NLP): English (non-English meme text is handled via the pipeline's translation step)
- License: MIT
- Finetuned from model: llava-hf/llava-v1.6-mistral-7b-hf
- Repository: https://github.com/aliencaocao/vlm-for-memes-aisg
- Paper: Detecting Offensive Memes with Social Biases in Singapore Context Using Multimodal Large Language Models (https://arxiv.org/abs/2502.18101)
Uses
Direct Use
The model can be used directly to classify memes as offensive or non-offensive. Input is a meme image; the pipeline extracts embedded text with OCR, translates it to English where necessary, and then passes the image and text to the VLM for classification, as sketched below.
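The following is a minimal preprocessing sketch under stated assumptions: pytesseract as the OCR engine and the stubbed translate_to_english helper are illustrative stand-ins, not the components used in the paper.

```python
# Illustrative preprocessing sketch: OCR the meme text, translate it to
# English, and return both pieces for the VLM. pytesseract and the
# translate_to_english stub are assumptions, not the paper's components.
from PIL import Image
import pytesseract  # requires the Tesseract binary to be installed

def translate_to_english(text: str) -> str:
    # Placeholder: swap in any machine-translation model or service here.
    return text

def preprocess_meme(path: str) -> tuple[Image.Image, str]:
    image = Image.open(path)
    ocr_text = pytesseract.image_to_string(image)  # extract overlaid text
    return image, translate_to_english(ocr_text)
```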
Downstream Use
This model can be integrated into larger content moderation systems to enhance the detection of offensive memes, specifically targeting the Singaporean context.
Out-of-Scope Use
This model is specifically trained for the Singaporean context. Its performance may degrade significantly when applied to memes from other cultures or regions. It is also not suitable for general-purpose image classification tasks.
Bias, Risks, and Limitations
The model's performance is inherently tied to the quality and representativeness of the training data. Biases present in the training data may be reflected in the model's output, particularly regarding the interpretation of culturally specific humor or references. The model may misclassify memes due to ambiguities in language or visual representation. It is crucial to use this model responsibly and acknowledge its limitations.
Recommendations
Users should be aware of the potential biases and limitations of the model. Human review of the model's output is strongly recommended, especially in high-stakes scenarios. Further research into mitigating bias and enhancing robustness is needed.
How to Get Started with the Model
The snippet below is a minimal inference sketch using the standard transformers LLaVA-NeXT API. It assumes the fine-tuned checkpoint loads like its base model (llava-hf/llava-v1.6-mistral-7b-hf); the yes/no prompt is illustrative and may differ from the prompt used during fine-tuning.
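```python
# Minimal inference sketch (assumptions: standard LLaVA-NeXT loading; the
# classification prompt below is illustrative, not necessarily the one used
# during fine-tuning).
import torch
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

model_id = "aliencaocao/llava-1.6-mistral-7b-offensive-meme-singapore"
# If the fine-tuned repo lacks processor files, load the processor from the
# base model llava-hf/llava-v1.6-mistral-7b-hf instead.
processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

image = Image.open("meme.jpg")
# Mistral-style prompt format used by llava-v1.6-mistral-7b-hf.
prompt = "[INST] <image>\nIs this meme offensive in the Singaporean context? Answer yes or no. [/INST]"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20)
print(processor.decode(output[0], skip_special_tokens=True))
```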
Training Details
Training Data
A dataset of 112K memes labeled by GPT-4V; see the paper and the repository linked above for details.
Training Procedure
[More Information Needed]
Evaluation
Testing Data, Factors & Metrics
Testing Data
A held-out test set of memes in the Singaporean context; see the paper for details.
Factors
[More Information Needed]
Metrics
Accuracy and AUROC (area under the receiver operating characteristic curve).
Results
Self-reported results on the held-out test set: AUROC 0.735 and accuracy 0.726 (see also Evaluation results below).
Summary
[More Information Needed]
Model Examination
[More Information Needed]
Environmental Impact
[More Information Needed]
Technical Specifications
[More Information Needed]
Citation
@misc{yuxuan2025detectingoffensivememessocial,
  title={Detecting Offensive Memes with Social Biases in Singapore Context Using Multimodal Large Language Models},
  author={Cao Yuxuan and Wu Jiayang and Alistair Cheong Liang Chuen and Bryan Shan Guanrong and Theodore Lee Chong Jen and Sherman Chann Zhi Shen},
  year={2025},
  eprint={2502.18101},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2502.18101},
}
Glossary
[More Information Needed]
More Information
[More Information Needed]
Model Card Authors
[More Information Needed]
Model Card Contact
[More Information Needed]
Evaluation results
- AUROC on the Offensive Memes in Singapore Context test set: 0.735 (self-reported)
- Accuracy on the Offensive Memes in Singapore Context test set: 0.726 (self-reported)
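For reference, a minimal sketch of how these two metrics can be computed from binary labels and model scores, using scikit-learn (assumed tooling, not necessarily what the authors used):

```python
# Minimal metric sketch with scikit-learn (an assumption, not necessarily the
# authors' evaluation code). y_score is the model's probability that a meme
# is offensive; labels use 1 = offensive, 0 = not offensive.
from sklearn.metrics import accuracy_score, roc_auc_score

y_true = [1, 0, 1, 0]            # illustrative ground-truth labels
y_score = [0.9, 0.2, 0.4, 0.6]   # illustrative model scores

print("AUROC:", roc_auc_score(y_true, y_score))
print("Accuracy:", accuracy_score(y_true, [int(s >= 0.5) for s in y_score]))
```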