---
language:
- en
license: mit
datasets:
- cardiffnlp/x_sensitive
metrics:
- f1
widget:
- text: Call me today to earn some money mofos!
pipeline_tag: text-classification
---

# twitter-roberta-large-sensitive-multilabel

This is a RoBERTa-large model trained on 154M tweets posted up to the end of December 2022 and fine-tuned to detect sensitive content (multilabel classification) on the [_X-Sensitive_](https://huggingface.co/datasets/cardiffnlp/x_sensitive) dataset.
The original Twitter-based RoBERTa model can be found [here](https://huggingface.co/cardiffnlp/twitter-roberta-large-2022-154m).

A sensitive content binary model can be found [here](https://huggingface.co/cardiffnlp/twitter-roberta-large-sensitive-binary).



## Labels
```
"id2label": {
  "0": "conflictual",
  "1": "profanity",
  "2": "sex",
  "3": "drugs",
  "4": "selfharm",
  "5": "spam",
  "6": "not-sensitive"
}
```
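
Because the task is multilabel, each label is usually scored on its own rather than through a softmax across all seven. The following is a minimal sketch of that idea, assuming the classification head emits one raw logit per label and that a sigmoid is applied to each. The `logits_to_scores` helper and the logit values are hypothetical and only illustrate the mapping.

```python
import math

# Label mapping copied from the configuration shown above.
ID2LABEL = {
    0: "conflictual",
    1: "profanity",
    2: "sex",
    3: "drugs",
    4: "selfharm",
    5: "spam",
    6: "not-sensitive",
}


def logits_to_scores(logits):
    """Map one raw logit per label to an independent sigmoid probability."""
    if len(logits) != len(ID2LABEL):
        raise ValueError("expected one logit per label")
    return {
        ID2LABEL[i]: 1.0 / (1.0 + math.exp(-logit))
        for i, logit in enumerate(logits)
    }


# Hypothetical logits for illustration; not produced by the actual model.
example_logits = [-3.0, 4.0, -4.0, -4.0, -5.0, -2.5, -4.5]
scores = logits_to_scores(example_logits)
```

Since the scores are independent, they do not need to sum to 1, and more than one label can receive a high score for the same text.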

## Full classification example

```python
from transformers import pipeline

# Load the fine-tuned multilabel classifier from the Hugging Face Hub.
pipe = pipeline(model='cardiffnlp/twitter-roberta-large-sensitive-multilabel')
text = "Call me today to earn some money mofos!"

# Score the example text.
pipe(text)
```
Output: 

```
[[{'label': 'conflictual', 'score': 0.03700090944766998},
  {'label': 'profanity', 'score': 0.9770461916923523},
  {'label': 'sex', 'score': 0.01981434039771557},
  {'label': 'drugs', 'score': 0.017757439985871315},
  {'label': 'selfharm', 'score': 0.008804548531770706},
  {'label': 'spam', 'score': 0.07784222811460495},
  {'label': 'not-sensitive', 'score': 0.010364986956119537}]]
```
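
Several labels can apply to the same text, so a common post-processing step is to keep every label whose score clears a threshold. The sketch below operates on the output shown above. The `labels_above` helper and the 0.5 cutoff are assumptions made for illustration, not values recommended by the model authors.

```python
def labels_above(label_scores, threshold=0.5):
    """Return the labels whose score is at least `threshold`."""
    return [item["label"] for item in label_scores if item["score"] >= threshold]


# Scores copied from the example output above.
output = [[
    {"label": "conflictual", "score": 0.03700090944766998},
    {"label": "profanity", "score": 0.9770461916923523},
    {"label": "sex", "score": 0.01981434039771557},
    {"label": "drugs", "score": 0.017757439985871315},
    {"label": "selfharm", "score": 0.008804548531770706},
    {"label": "spam", "score": 0.07784222811460495},
    {"label": "not-sensitive", "score": 0.010364986956119537},
]]

# The outer list holds one entry per input text; take the first.
predicted = labels_above(output[0])
print(predicted)  # ['profanity']
```

A lower threshold trades precision for recall; the right value depends on the application and is best chosen on held-out data.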



## BibTeX entry and citation info

```
@article{antypas2024sensitive,
  title={Sensitive Content Classification in Social Media: A Holistic Resource and Evaluation},
  author={Antypas, Dimosthenis and Sen, Indira and Perez-Almendros, Carla and Camacho-Collados, Jose and Barbieri, Francesco},
  journal={arXiv preprint arXiv:2411.19832},
  year={2024}
}
```