BenjaminOcampo committed on
Commit 6218ef6 · verified · 1 Parent(s): 53c7d9a

Add model's weights

Files changed (7)
  1. README.md +179 -85
  2. config.json +28 -0
  3. model.pt +3 -0
  4. special_tokens_map.json +7 -0
  5. tokenizer.json +0 -0
  6. tokenizer_config.json +16 -0
  7. vocab.txt +0 -0
README.md CHANGED
@@ -1,116 +1,210 @@
  ---
- base_model: BenjaminOcampo/model-contrastive-hatebert__trained-in-ishate__seed-0
- datasets:
- - ISHate
- language:
- - en
- library_name: transformers
- license: bsl-1.0
- metrics:
- - f1
- - accuracy
- tags:
- - hate-speech-detection
- - implicit-hate-speech
  ---

- This model card documents the demo paper "PEACE: Providing Explanations and
- Analysis for Combating Hate Expressions" accepted at the 27th European
- Conference on Artificial Intelligence: https://www.ecai2024.eu/calls/demos.

- # The Model
- This model is a hate speech detector fine-tuned specifically for detecting
- implicit hate speech. It is based on the paper "PEACE: Providing Explanations
- and Analysis for Combating Hate Expressions" by Greta Damo, Nicolás Benjamín
- Ocampo, Elena Cabrio, and Serena Villata, presented at the 27th European
- Conference on Artificial Intelligence.

- # Training Parameters and Experimental Info
- The model was trained using the ISHate dataset, focusing on implicit data.
- Training parameters included:
- - Batch size: 32
- - Weight decay: 0.01
- - Epochs: 4
- - Learning rate: 2e-5

- For detailed information on the training process, please refer to the [model's
- paper](https://aclanthology.org/2023.findings-emnlp.441/).
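
For illustration only, a minimal sketch of a fine-tuning loop using the hyperparameters listed above. It assumes hateBERT as the base encoder (suggested by the tokenizer files in this commit), reuses the `ContrastiveModel` wrapper defined in the Usage section below, and omits the contrastive objective described in the paper; `train_batches` is a hypothetical data loader.

```python
import torch
from torch.optim import AdamW
from transformers import AutoModel, AutoTokenizer

base = "GroNLP/hateBERT"  # assumption: hateBERT as the base encoder
tokenizer = AutoTokenizer.from_pretrained(base)
encoder = AutoModel.from_pretrained(base)
model = ContrastiveModel(encoder)  # wrapper class defined in the Usage section below

optimizer = AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
num_epochs = 4  # batch size 32 is assumed to be handled by the (hypothetical) train_batches loader

for epoch in range(num_epochs):
    for texts, labels in train_batches:  # hypothetical iterable of (list[str], LongTensor) pairs
        inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        _, logits = model(inputs["input_ids"], inputs["attention_mask"])
        # The actual training also adds a contrastive term on the embeddings (see the paper).
        loss = torch.nn.functional.cross_entropy(logits, labels)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```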

- # Usage

- First, you might need transformers version 4.30.2.

  ```
- pip install transformers==4.30.2
  ```

- This model was created using vanilla PyTorch. To load it, you have to use the following model class.

- ```python
- import torch.nn as nn
-
- class ContrastiveModel(nn.Module):
-     def __init__(self, model):
-         super(ContrastiveModel, self).__init__()
-         self.model = model
-         self.embedding_dim = model.config.hidden_size
-         self.fc = nn.Linear(self.embedding_dim, self.embedding_dim)
-         self.classifier = nn.Linear(self.embedding_dim, 2)  # Classification layer
-
-     def forward(self, input_ids, attention_mask):
-         outputs = self.model(input_ids, attention_mask)
-         embeddings = outputs.last_hidden_state[:, 0]  # Use the [CLS] token embedding as the representation
-         embeddings = self.fc(embeddings)
-         logits = self.classifier(embeddings)  # Apply classification layer
-
-         return embeddings, logits
- ```

- Then, we instantiate the model as:

- ```python
- from transformers import AutoModel, AutoTokenizer, AutoConfig
-
- repo_name = "BenjaminOcampo/peace_cont_hatebert"
-
- config = AutoConfig.from_pretrained(repo_name)
- contrastive_model = ContrastiveModel(AutoModel.from_config(config))
- tokenizer = AutoTokenizer.from_pretrained(repo_name)
- ```

- Finally, to load the weights of the model, we do as follows:

- ```python
- import torch
- from huggingface_hub import hf_hub_download
-
- # `read_token` is your Hugging Face access token with read permission.
- model_tmp_file = hf_hub_download(repo_id=repo_name, filename="model.pt", token=read_token)
-
- state_dict = torch.load(model_tmp_file)
- contrastive_model.load_state_dict(state_dict)
- ```

- You can make predictions as with any PyTorch model:

- ```python
- import torch
-
- text = "Are you sure that Islam is a peaceful religion?"
- inputs = tokenizer(text, return_tensors="pt")
-
- with torch.no_grad():
-     _, logits = contrastive_model(inputs["input_ids"], inputs["attention_mask"])
-
- probabilities = torch.softmax(logits, dim=1)
- _, predicted_labels = torch.max(probabilities, dim=1)
- ```
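
One possible way to read out the prediction; the card does not state the index-to-label mapping, so treating index 1 as the hateful class is an assumption to verify against the training label encoding.

```python
# Assumption: index 0 = Non-HS, index 1 = HS (the two classes reported in the evaluation below).
label_names = ["Non-HS", "HS"]
print(label_names[predicted_labels.item()], probabilities.squeeze().tolist())
```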

- # Datasets
- The model was trained on the [ISHate dataset](https://huggingface.co/datasets/BenjaminOcampo/ISHate), specifically
- the training part of the dataset, which focuses on implicit hate speech.
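
A minimal loading sketch, assuming the ISHate repository exposes its splits through the datasets library under that identifier:

```python
from datasets import load_dataset

# Assumption: the ISHate repo can be loaded directly by its Hub id.
ishate = load_dataset("BenjaminOcampo/ISHate")
print(ishate)
```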

- # Evaluation Results
- The model's performance was evaluated using standard metrics, including F1 score
- and accuracy. For comprehensive evaluation results, refer to the linked paper.

- Authors:
- - [Greta Damo](https://grexit-d.github.io/damo.greta.github.io/)
- - [Nicolás Benjamín Ocampo](https://www.nicolasbenjaminocampo.com/)
- - [Elena Cabrio](https://www-sop.inria.fr/members/Elena.Cabrio/)
- - [Serena Villata](https://webusers.i3s.unice.fr/~villata/Home.html)
  ---
+ language: en
  ---
+
+ # Model Card for BenjaminOcampo/model-contrastive-hatebert__trained-in-ishate__seed-0
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+ **Classification results on the test set**
  ```
+               precision    recall  f1-score   support
+
+       Non-HS     0.9139    0.8351    0.8727      2681
+           HS     0.7696    0.8749    0.8189      1687
+
+     accuracy                         0.8505      4368
+    macro avg     0.8417    0.8550    0.8458      4368
+ weighted avg     0.8581    0.8505    0.8519      4368
  ```
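
The layout of this report appears to match the output of scikit-learn's classification_report; a sketch of how such a table can be produced, where `y_true` and `y_pred` are illustrative names for the gold and predicted 0/1 labels on the test split:

```python
from sklearn.metrics import classification_report

# `y_true` and `y_pred` are illustrative: lists of 0/1 labels for the ISHate test set.
print(classification_report(y_true, y_pred, target_names=["Non-HS", "HS"], digits=4))
```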
+
+ - **Developed by:** Benjamin Ocampo
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** en
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** https://github.com/huggingface/huggingface_hub
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
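
Pending an official snippet, a minimal loading sketch adapted from the usage section of the previous version of this card. It assumes the custom `ContrastiveModel` wrapper defined there, and the repo id used in that version; pass a token to `hf_hub_download` if the repository requires authentication.

```python
import torch
from huggingface_hub import hf_hub_download
from transformers import AutoConfig, AutoModel, AutoTokenizer

repo_name = "BenjaminOcampo/peace_cont_hatebert"  # repo id from the previous card version

config = AutoConfig.from_pretrained(repo_name)
model = ContrastiveModel(AutoModel.from_config(config))  # wrapper class from the previous card version
tokenizer = AutoTokenizer.from_pretrained(repo_name)

weights = hf_hub_download(repo_id=repo_name, filename="model.pt")  # add token=... if the repo is gated
model.load_state_dict(torch.load(weights))
```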
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Data Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]
config.json ADDED
@@ -0,0 +1,28 @@
+ {
+   "_name_or_path": "BenjaminOcampo/model-hatebert__trained-in-ishate__seed-0",
+   "_num_labels": 2,
+   "architectures": [
+     "BertForSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "output_past": true,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "problem_type": "single_label_classification",
+   "torch_dtype": "float32",
+   "transformers_version": "4.27.4",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
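
A quick way to inspect this configuration once the files are on the Hub; the repo id below is the one used in the previous card version and is an assumption here.

```python
from transformers import AutoConfig

# Assumption: the config above is served from this model repository.
config = AutoConfig.from_pretrained("BenjaminOcampo/peace_cont_hatebert")
print(config.model_type, config.hidden_size, config.num_hidden_layers, config.num_labels)
```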
model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5ab1235605b6535716e7350c287f2c155751dc18fd6c4fa6f23e5202f0ad42f0
+ size 440370701
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "max_len": 512,
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "special_tokens_map_file": "/home/nocampo/.cache/huggingface/hub/models--GroNLP--hateBERT/snapshots/f56d507e4b6a64413aff29e541e1b2178ee79d67/special_tokens_map.json",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
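
These settings correspond to a lowercasing BertTokenizer with a 512-token limit. A small sanity-check sketch, assuming the tokenizer files added in this commit are served from the repo id used in the previous card version:

```python
from transformers import AutoTokenizer

# Assumption: this repository serves the tokenizer files added in this commit.
tokenizer = AutoTokenizer.from_pretrained("BenjaminOcampo/peace_cont_hatebert")
enc = tokenizer("Are you sure that Islam is a peaceful religion?", truncation=True, max_length=512)
print(tokenizer.convert_ids_to_tokens(enc["input_ids"]))
```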
vocab.txt ADDED
The diff for this file is too large to render. See raw diff