Text Classification
Transformers
PyTorch
English
deberta
hate-speech-detection
Inference Endpoints
HannahRoseKirk committed dda52b6 (parent: 1e97121): Update README.md

```
python3 transformers/examples/pytorch/text-classification/run_glue.py \
```

We experimented with upsampling the train split of each round in increments of [1, 5, 10, 100] to improve performance. The optimal upsampling ratio for each round, including the R1-R4 text-only rounds from Vidgen et al., is carried forward to all subsequent rounds. This model is trained with upsampling ratios of `{'R0': 1, 'R1': 5, 'R2': 100, 'R3': 1, 'R4': 1, 'R5': 100, 'R6': 1, 'R7': 5}`.
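The per-round upsampling described above can be sketched in plain Python. This is an illustration of the general technique (replicating each round's training examples by its ratio before shuffling), not the paper's actual data pipeline; the round names and toy data are placeholders.

```python
import random

# Upsampling ratios as stated in the model card.
ratios = {'R0': 1, 'R1': 5, 'R2': 100, 'R3': 1, 'R4': 1,
          'R5': 100, 'R6': 1, 'R7': 5}

def upsample(rounds, ratios, seed=0):
    """rounds: dict mapping round name -> list of training examples.
    Replicates each round's examples by its ratio, then shuffles."""
    train = []
    for name, examples in rounds.items():
        train.extend(examples * ratios[name])  # replicate the whole round
    random.Random(seed).shuffle(train)
    return train

# Toy data: two placeholder examples per round.
rounds = {name: [f"{name}-ex{i}" for i in range(2)] for name in ratios}
train = upsample(rounds, ratios)
print(len(train))  # 2 * (1+5+100+1+1+100+1+5) = 428
```

In practice the replicated rounds would be concatenated into the train split passed to `run_glue.py`.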

## Variables and metrics
We wished to train a model that could effectively encode information about emoji-based hate without worsening performance on text-only hate. Thus, we evaluate the model on:
* [HatemojiCheck](https://huggingface.co/datasets/HannahRoseKirk/HatemojiCheck), an evaluation checklist with 7 functionalities of emoji-based hate and contrast sets
* [HateCheck](https://huggingface.co/datasets/Paul/hatecheck), an evaluation checklist containing 29 functional tests for hate speech and contrast sets
* The held-out test sets from [HatemojiBuild](https://huggingface.co/datasets/HannahRoseKirk/HatemojiBuild), the three rounds of adversarially generated data collection with emoji-containing examples (R5-R7), available on Hugging Face
* The held-out test sets from the four rounds of adversarially generated data collection with text-only examples (R1-R4, from [Vidgen et al.](https://github.com/bvidgen/Dynamically-Generated-Hate-Speech-Dataset))

For the round-specific test sets, we used a weighted F1-score across them to choose the final model in each round. For more details, see our [paper](https://arxiv.org/abs/2108.05921).
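For reference, a weighted F1-score averages per-class F1 weighted by class support. The sketch below shows that calculation in plain Python for binary hate/not-hate labels; it is an assumption-laden illustration (toy labels, not the paper's evaluation code), and `sklearn.metrics.f1_score(average='weighted')` computes the same quantity.

```python
from collections import Counter

def f1(y_true, y_pred, positive):
    """Per-class F1, treating `positive` as the positive label."""
    tp = sum(t == p == positive for t, p in zip(y_true, y_pred))
    fp = sum(p == positive and t != positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def weighted_f1(y_true, y_pred):
    """Average of per-class F1 scores, weighted by class support."""
    support = Counter(y_true)
    n = len(y_true)
    return sum(support[c] / n * f1(y_true, y_pred, c) for c in support)

# Toy labels: 1 = hateful, 0 = not hateful.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
print(round(weighted_f1(y_true, y_pred), 3))  # 4/6 * 0.75 + 2/6 * 0.5 = 0.667
```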

## Evaluation results
We compare our model in each iteration (R6-T, R7-T, R8-T) to:
* **P-IA**: the identity attack attribute from Perspective API
* **P-TX**: the toxicity attribute from Perspective API
* **B-D**: a BERT model trained on the [Davidson et al. (2017)](https://github.com/t-davidson/hate-speech-and-offensive-language) dataset
* **B-F**: a BERT model trained on the [Founta et al. (2018)](https://github.com/ENCASEH2020/hatespeech-twitter) dataset

|          | **Emoji Test Sets** |            |                   |            | **Text Test Sets** |        |               |            | **All Rounds** |            |
| :------- | :-----------------: | :--------: | :---------------: | :--------: | :----------------: | :----: | :-----------: | :--------: | :------------: | :--------: |
|          | **R5-R7**           |            | **HatemojiCheck** |            | **R1-R4**          |        | **HateCheck** |            | **R1-R7**      |            |
|          | **Acc**             | **F1**     | **Acc**           | **F1**     | **Acc**            | **F1** | **Acc**       | **F1**     | **Acc**        | **F1**     |
| **P-IA** | 0.508               | 0.394      | 0.689             | 0.754      | 0.679              | 0.720  | 0.765         | 0.839      | 0.658          | 0.689      |
| **P-TX** | 0.523               | 0.448      | 0.650             | 0.711      | 0.602              | 0.659  | 0.720         | 0.813      | 0.592          | 0.639      |
| **B-D**  | 0.489               | 0.270      | 0.578             | 0.636      | 0.589              | 0.607  | 0.632         | 0.738      | 0.591          | 0.586      |
| **B-F**  | 0.496               | 0.322      | 0.552             | 0.605      | 0.562              | 0.562  | 0.602         | 0.694      | 0.557          | 0.532      |
| **R6-T** | 0.757               | **0.769**  | **0.879**         | **0.910**  | 0.823              | 0.837  | 0.961         | 0.971      | 0.813          | 0.825      |
| **R7-T** | **0.759**           | 0.762      | 0.867             | 0.899      | 0.824              | 0.842  | 0.955         | 0.967      | 0.813          | 0.829      |
| **R8-T** | 0.744               | 0.755      | 0.871             | 0.904      | 0.827              | 0.844  | **0.966**     | **0.975**  | **0.814**      | **0.829**  |
81
+
82
+ For full discussion of the model results, see our [paper](https://arxiv.org/abs/2108.05921).
83
+
84
+ A recent [paper][https://arxiv.org/pdf/2202.11176.pdf] _A New Generation of Perspective API:Efficient Multilingual Character-level Transformers_ beats this model on the HatemojiCheck benchmark.