Commit 06a264e by kornosk · 1 Parent(s): 3718c07

Create README.md

Files changed (1)
  1. README.md +71 -0
README.md ADDED
@@ -0,0 +1,71 @@
+ ---
+ language: "en"
+ tags:
+ - twitter
+ - masked-token-prediction
+ - election2020
+ license: "gpl-3.0"
+ ---
+
+ # Pre-trained BERT on Twitter US Political Election 2020
+
+ Pre-trained weights for [Knowledge Enhanced Masked Language Model for Stance Detection](https://2021.naacl.org/program/accepted/), NAACL 2021.
+
+ # Training Data
+
+ This model is pre-trained on over 5 million English tweets about the 2020 US Presidential Election.
+
+ # Training Objective
+
+ This model is initialized with BERT-base and then further pre-trained on these tweets with the standard masked language modeling (MLM) objective.
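
To make the MLM objective concrete, here is a minimal sketch of the token corruption step. It assumes that "standard MLM" means the usual BERT recipe (about 15% of positions selected; of those, 80% replaced with `[MASK]`, 10% replaced with a random token, 10% left unchanged); the README does not state the exact settings used. The function name `mask_tokens`, the whitespace tokenization, and the toy vocabulary are illustrative only and are not taken from the authors' code.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Apply BERT-style MLM corruption to a list of tokens.

    Returns (corrupted, labels). labels[i] holds the original token at
    positions selected for prediction and None everywhere else.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK_TOKEN  # 80%: replace with [MASK]
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return corrupted, labels


# Toy example (assumption: whitespace tokens instead of WordPiece).
tokens = "trump is the president of usa".split()
vocab = ["vote", "biden", "election", "senate", "ballot"]
corrupted, labels = mask_tokens(tokens, vocab, seed=0)
print(corrupted)
print(labels)
```

During pre-training, the model sees `corrupted` as input and is trained to predict the original token only at positions where `labels` is not `None`.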
+
+ # Usage
+
+ This pre-trained language model **can be fine-tuned for any downstream task (e.g. classification)**.
+
+ Please see the [official repository](https://github.com/GU-DataLab/stance-detection-KE-MLM) for more details.
+
+ ```python
+ from transformers import BertTokenizer, BertForMaskedLM, pipeline
+ import torch
+
+ # choose GPU if available
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+
+ # select model path here
+ pretrained_LM_path = "kornosk/bert-political-election2020-twitter-mlm"
+
+ # load tokenizer and model, then move the model to the chosen device
+ tokenizer = BertTokenizer.from_pretrained(pretrained_LM_path)
+ model = BertForMaskedLM.from_pretrained(pretrained_LM_path).to(device)
+ model.eval()
+
+ # fill mask (the pipeline takes a device index: 0 for the first GPU, -1 for CPU)
+ example = "Trump is the [MASK] of USA"
+ fill_mask = pipeline('fill-mask', model=model, tokenizer=tokenizer,
+                      device=0 if device.type == "cuda" else -1)
+
+ outputs = fill_mask(example)
+ print(outputs)
+
+ # see embeddings: request hidden states, since the default output only holds MLM logits
+ inputs = tokenizer(example, return_tensors="pt").to(device)
+ with torch.no_grad():
+     outputs = model(**inputs, output_hidden_states=True)
+ print(outputs.hidden_states[-1])  # last-layer token embeddings
+
+ # OR you can use this model to train on your downstream task!
+ # please consider citing our paper if you feel this is useful :)
+ ```
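
As a rough illustration of the fine-tuning route mentioned in the usage example, the sketch below runs a single gradient step of sequence classification on top of the pre-trained weights. The label set (`LABELS`), the helper name `fine_tune_step`, and the learning rate are assumptions made for illustration; they are not taken from the README or the official repository. Loading an MLM checkpoint into `BertForSequenceClassification` discards the MLM head and adds a newly initialized classifier, so `transformers` will warn about uninitialized weights.

```python
# Hypothetical stance labels; the README does not define a label set.
LABELS = ["AGAINST", "FAVOR", "NONE"]
LABEL2ID = {label: i for i, label in enumerate(LABELS)}

MODEL_PATH = "kornosk/bert-political-election2020-twitter-mlm"


def fine_tune_step(texts, labels, model_path=MODEL_PATH, lr=2e-5):
    """Run one gradient step of sequence classification and return the loss."""
    # Imported lazily so the label helpers above work without heavy dependencies.
    import torch
    from transformers import BertTokenizer, BertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained(model_path)
    # The encoder weights are reused; the classification head is new.
    model = BertForSequenceClassification.from_pretrained(
        model_path, num_labels=len(LABELS)
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    # Tokenize the batch and convert string labels to class indices.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    targets = torch.tensor([LABEL2ID[label] for label in labels])

    model.train()
    loss = model(**batch, labels=targets).loss  # cross-entropy over LABELS
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

Calling `fine_tune_step(["Four more years!", "Time for a change."], ["FAVOR", "AGAINST"])` downloads the checkpoint and returns the loss for that one step. A real training run would loop over a labeled dataset for several epochs and evaluate on held-out data.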
+
+ # Reference
+
+ - [Knowledge Enhanced Masked Language Model for Stance Detection](https://2021.naacl.org/program/accepted/), NAACL 2021.
+
+ # Citation
+ ```bibtex
+ @inproceedings{kawintiranon2021knowledge,
+     title={Knowledge Enhanced Masked Language Model for Stance Detection},
+     author={Kawintiranon, Kornraphop and Singh, Lisa},
+     booktitle={Proceedings of the 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
+     year={2021},
+     url={#}
+ }
+ ```