# PrivBERT 
PrivBERT is a language model for privacy policies. We pre-trained PrivBERT on ~1 million privacy policies, starting from the pretrained RoBERTa model. The data is available at [https://privaseer.ist.psu.edu/data](https://privaseer.ist.psu.edu/data).

## Usage
```python
from transformers import AutoTokenizer, AutoModel

# Load the PrivBERT tokenizer and encoder weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("mukund/privbert")
model = AutoModel.from_pretrained("mukund/privbert")
```
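
Once the tokenizer and model are loaded, a common next step is to turn policy text into fixed-size vectors. The sketch below is one possible way to do that, not the authors' documented method: it assumes PyTorch is installed, and the helper names `mean_pool` and `embed`, the mean-pooling strategy, and the `max_length=512` default are all illustrative assumptions.

```python
import torch


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sequence, ignoring padding positions."""
    # Expand the mask to (batch, seq_len, 1) so it broadcasts over the hidden size
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)   # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1.0)          # guard against division by zero
    return summed / counts


def embed(texts, tokenizer, model, max_length=512):
    """Return one mean-pooled vector per input text (hypothetical helper)."""
    # Pad to the longest text in the batch and truncate overly long policies
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=max_length, return_tensors="pt")
    model.eval()
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**batch)
    return mean_pool(outputs.last_hidden_state, batch["attention_mask"])
```

With the `tokenizer` and `model` from the snippet above, `embed(["We do not sell your personal data."], tokenizer, model)` would return a tensor with one row per input text. Mean pooling is only one choice here; using the first-token vector is another option, and which works better depends on the downstream task.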

## License 
If you use this dataset in research, you must cite the following paper.
```
Mukund Srinath, Shomir Wilson and C. Lee Giles. Privacy at Scale: Introducing the PrivaSeer Corpus of Web Privacy Policies. In Proc. ACL 2021.
```
For research, teaching, and scholarship purposes, the model is available under a CC BY-NC-SA license. Please contact us for any requests regarding commercial use.