Edit model card

General concept of the model

This model is trained on the dataset of sensitive topics of the Russian language. The concept of sensitive topics is described in this article presented at the workshop for Balto-Slavic NLP at the EACL-2021 conference. Please note that this article describes the first version of the dataset, while the model is trained on the extended version of the dataset open-sourced on our GitHub or on kaggle. The properties of the dataset is the same as the one described in the article, the only difference is the size.

Instructions

The model predicts combinations of 18 sensitive topics described in the article. You can find step-by-step instructions for using the model here

Metrics

The dataset partially manually labeled samples and partially semi-automatically labeled samples. Learn more in our article. We tested the performance of the classifier only on the part of manually labeled data that is why some topics are not well represented in the test set.

precision recall f1-score support
offline_crime 0.65 0.55 0.6 132
online_crime 0.5 0.46 0.48 37
drugs 0.87 0.9 0.88 87
gambling 0.5 0.67 0.57 6
pornography 0.73 0.59 0.65 204
prostitution 0.75 0.69 0.72 91
slavery 0.72 0.72 0.73 40
suicide 0.33 0.29 0.31 7
terrorism 0.68 0.57 0.62 47
weapons 0.89 0.83 0.86 138
body_shaming 0.9 0.67 0.77 109
health_shaming 0.84 0.55 0.66 108
politics 0.68 0.54 0.6 241
racism 0.81 0.59 0.68 204
religion 0.94 0.72 0.81 102
sexual_minorities 0.69 0.46 0.55 102
sexism 0.66 0.64 0.65 132
social_injustice 0.56 0.37 0.45 181
none 0.62 0.67 0.64 250
micro avg 0.72 0.61 0.66 2218
macro avg 0.7 0.6 0.64 2218
weighted avg 0.73 0.61 0.66 2218
samples avg 0.75 0.66 0.68 2218

Licensing Information

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

CC BY-NC-SA 4.0

Citation

If you find this repository helpful, feel free to cite our publication:

@inproceedings{babakov-etal-2021-detecting,
    title = "Detecting Inappropriate Messages on Sensitive Topics that Could Harm a Company{'}s Reputation",
    author = "Babakov, Nikolay  and
      Logacheva, Varvara  and
      Kozlova, Olga  and
      Semenov, Nikita  and
      Panchenko, Alexander",
    booktitle = "Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing",
    month = apr,
    year = "2021",
    address = "Kiyv, Ukraine",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.bsnlp-1.4",
    pages = "26--36",
    abstract = "Not all topics are equally {``}flammable{''} in terms of toxicity: a calm discussion of turtles or fishing less often fuels inappropriate toxic dialogues than a discussion of politics or sexual minorities. We define a set of sensitive topics that can yield inappropriate and toxic messages and describe the methodology of collecting and labelling a dataset for appropriateness. While toxicity in user-generated data is well-studied, we aim at defining a more fine-grained notion of inappropriateness. The core of inappropriateness is that it can harm the reputation of a speaker. This is different from toxicity in two respects: (i) inappropriateness is topic-related, and (ii) inappropriate message is not toxic but still unacceptable. We collect and release two datasets for Russian: a topic-labelled dataset and an appropriateness-labelled dataset. We also release pre-trained classification models trained on this data.",
}
Downloads last month
1,053
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.