TheMrguiller committed c297461 (parent: 1b1ee38): Update README.md

README.md CHANGED
@@ -2,6 +2,15 @@
 language:
 - en
 pipeline_tag: text-classification
+license: mit
+metrics:
+- accuracy
+- f1
 ---
-It is a model
-
+This model is part of the research presented in "Mitigating Toxicity in Dialogue Agents through Adversarial Reinforcement Learning," a conference paper that addresses dialogue-agent toxicity at three levels: explicit, implicit, and contextual. The model predicts the toxicity of a response given the conversation history that precedes it, and is designed for dialogue agents. To use it correctly, format the input as shown below:
+
+[HST]Hi, how are you?[END]I am doing fine[ANS]I hope you die.
+
+The token [HST] opens the conversation history, each turn is separated by [END], and [ANS] marks the start of the response to the last utterance. I will update this card with the full results; for now, I am focused on a larger project built on these models.
+
+The model was trained on the Dialogue Safety dataset and the Bot-Adversarial Dialogue dataset.
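For reference, the input scheme described in the new card maps directly onto a standard Hugging Face text-classification pipeline. The sketch below is illustrative, not the author's released code: the checkpoint ID `TheMrguiller/toxicity-classifier` is a placeholder (the card does not name the repository), `build_input` is a helper introduced here, and it assumes the uploaded tokenizer already registers [HST], [END], and [ANS].

```python
from transformers import pipeline

# Placeholder checkpoint ID: the card does not name the repository,
# so substitute the actual model path before running.
MODEL_ID = "TheMrguiller/toxicity-classifier"

def build_input(history: list[str], response: str) -> str:
    """Assemble the card's input scheme: [HST] opens the history,
    [END] separates the turns, and [ANS] precedes the response
    whose toxicity is being scored."""
    return "[HST]" + "[END]".join(history) + "[ANS]" + response

classifier = pipeline("text-classification", model=MODEL_ID)

# Reproduces the example string from the card.
text = build_input(["Hi, how are you?", "I am doing fine"], "I hope you die.")
print(text)              # [HST]Hi, how are you?[END]I am doing fine[ANS]I hope you die.
print(classifier(text))  # e.g. [{'label': ..., 'score': ...}]
```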