julien-c (HF staff) committed
Commit 555a2b0
1 Parent(s): 034495f

Migrate model card from transformers-repo


Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/DJSammy/bert-base-danish-uncased_BotXO,ai/README.md

Files changed (1):
  1. README.md (+142, -0)

README.md (added):
---
language: da
tags:
- bert
- masked-lm
license: cc-by-4.0
datasets:
- common_crawl
- wikipedia
pipeline_tag: fill-mask
widget:
- text: "København er [MASK] i Danmark."
---

# Danish BERT (uncased) model

[BotXO.ai](https://www.botxo.ai/) developed this model. For data and training details, see their [GitHub repository](https://github.com/botxo/nordic_bert).

The original model was trained in TensorFlow; I converted it to PyTorch using [transformers-cli](https://huggingface.co/transformers/converting_tensorflow_models.html?highlight=cli).

The TensorFlow version can be downloaded here: https://www.dropbox.com/s/19cjaoqvv2jicq9/danish_bert_uncased_v2.zip?dl=1
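
For reference, here is a minimal sketch of what that TF-to-PyTorch conversion looks like in Python. The local paths are illustrative (they assume the Dropbox archive above has been unzipped to `danish_bert_uncased_v2/`), and this mirrors, rather than reproduces, the exact `transformers-cli` invocation:

```python
# Minimal TF -> PyTorch conversion sketch (illustrative paths; reading the
# original checkpoint requires TensorFlow to be installed).
import torch
from transformers import BertConfig, BertForPreTraining, load_tf_weights_in_bert

config = BertConfig.from_json_file("danish_bert_uncased_v2/bert_config.json")
model = BertForPreTraining(config)

# Copies the TensorFlow checkpoint weights into the PyTorch module in place.
load_tf_weights_in_bert(model, config, "danish_bert_uncased_v2/bert_model.ckpt")

torch.save(model.state_dict(), "danish_bert_uncased_v2/pytorch_model.bin")
```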

## Architecture

```python
from transformers import AutoModelForPreTraining

model = AutoModelForPreTraining.from_pretrained("DJSammy/bert-base-danish-uncased_BotXO,ai")

params = list(model.named_parameters())
print('danish_bert_uncased_v2 has {:} different named parameters.\n'.format(len(params)))

print('==== Embedding Layer ====\n')
for p in params[0:5]:
    print("{:<55} {:>12}".format(p[0], str(tuple(p[1].size()))))

print('\n==== First Transformer ====\n')
for p in params[5:21]:
    print("{:<55} {:>12}".format(p[0], str(tuple(p[1].size()))))

print('\n==== Last Transformer ====\n')
for p in params[181:197]:
    print("{:<55} {:>12}".format(p[0], str(tuple(p[1].size()))))

print('\n==== Output Layer ====\n')
for p in params[197:]:
    print("{:<55} {:>12}".format(p[0], str(tuple(p[1].size()))))

# danish_bert_uncased_v2 has 206 different named parameters.

# ==== Embedding Layer ====

# bert.embeddings.word_embeddings.weight                  (32000, 768)
# bert.embeddings.position_embeddings.weight                (512, 768)
# bert.embeddings.token_type_embeddings.weight                (2, 768)
# bert.embeddings.LayerNorm.weight                              (768,)
# bert.embeddings.LayerNorm.bias                                (768,)

# ==== First Transformer ====

# bert.encoder.layer.0.attention.self.query.weight          (768, 768)
# bert.encoder.layer.0.attention.self.query.bias                (768,)
# bert.encoder.layer.0.attention.self.key.weight            (768, 768)
# bert.encoder.layer.0.attention.self.key.bias                  (768,)
# bert.encoder.layer.0.attention.self.value.weight          (768, 768)
# bert.encoder.layer.0.attention.self.value.bias                (768,)
# bert.encoder.layer.0.attention.output.dense.weight        (768, 768)
# bert.encoder.layer.0.attention.output.dense.bias              (768,)
# bert.encoder.layer.0.attention.output.LayerNorm.weight        (768,)
# bert.encoder.layer.0.attention.output.LayerNorm.bias          (768,)
# bert.encoder.layer.0.intermediate.dense.weight           (3072, 768)
# bert.encoder.layer.0.intermediate.dense.bias                 (3072,)
# bert.encoder.layer.0.output.dense.weight                 (768, 3072)
# bert.encoder.layer.0.output.dense.bias                        (768,)
# bert.encoder.layer.0.output.LayerNorm.weight                  (768,)
# bert.encoder.layer.0.output.LayerNorm.bias                    (768,)

# ==== Last Transformer ====

# bert.encoder.layer.11.attention.self.query.weight         (768, 768)
# bert.encoder.layer.11.attention.self.query.bias               (768,)
# bert.encoder.layer.11.attention.self.key.weight           (768, 768)
# bert.encoder.layer.11.attention.self.key.bias                 (768,)
# bert.encoder.layer.11.attention.self.value.weight         (768, 768)
# bert.encoder.layer.11.attention.self.value.bias               (768,)
# bert.encoder.layer.11.attention.output.dense.weight       (768, 768)
# bert.encoder.layer.11.attention.output.dense.bias             (768,)
# bert.encoder.layer.11.attention.output.LayerNorm.weight       (768,)
# bert.encoder.layer.11.attention.output.LayerNorm.bias         (768,)
# bert.encoder.layer.11.intermediate.dense.weight          (3072, 768)
# bert.encoder.layer.11.intermediate.dense.bias                (3072,)
# bert.encoder.layer.11.output.dense.weight                (768, 3072)
# bert.encoder.layer.11.output.dense.bias                       (768,)
# bert.encoder.layer.11.output.LayerNorm.weight                 (768,)
# bert.encoder.layer.11.output.LayerNorm.bias                   (768,)

# ==== Output Layer ====

# bert.pooler.dense.weight                                  (768, 768)
# bert.pooler.dense.bias                                        (768,)
# cls.predictions.bias                                        (32000,)
# cls.predictions.transform.dense.weight                    (768, 768)
# cls.predictions.transform.dense.bias                          (768,)
# cls.predictions.transform.LayerNorm.weight                    (768,)
# cls.predictions.transform.LayerNorm.bias                      (768,)
# cls.seq_relationship.weight                                 (2, 768)
# cls.seq_relationship.bias                                       (2,)
```

## Example Pipeline

```python
from transformers import pipeline

unmasker = pipeline('fill-mask', model='DJSammy/bert-base-danish-uncased_BotXO,ai')

unmasker('København er [MASK] i Danmark.')
# (English: 'Copenhagen is the [MASK] of Denmark.')

# =>
# [{'score': 0.788068950176239,
#   'sequence': '[CLS] københavn er hovedstad i danmark. [SEP]',
#   'token': 12610,
#   'token_str': 'hovedstad'},
#  {'score': 0.07606703042984009,
#   'sequence': '[CLS] københavn er hovedstaden i danmark. [SEP]',
#   'token': 8108,
#   'token_str': 'hovedstaden'},
#  {'score': 0.04299738258123398,
#   'sequence': '[CLS] københavn er metropol i danmark. [SEP]',
#   'token': 23305,
#   'token_str': 'metropol'},
#  {'score': 0.008163209073245525,
#   'sequence': '[CLS] københavn er ikke i danmark. [SEP]',
#   'token': 89,
#   'token_str': 'ikke'},
#  {'score': 0.006238455418497324,
#   'sequence': '[CLS] københavn er ogsa i danmark. [SEP]',
#   'token': 25253,
#   'token_str': 'ogsa'}]
```
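
The same top-5 prediction can also be computed by hand with the tokenizer and model directly; a minimal sketch (results should match the pipeline output above, up to small score differences across `transformers` versions):

```python
# Manual fill-mask sketch: same model, without the pipeline wrapper.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("DJSammy/bert-base-danish-uncased_BotXO,ai")
model = AutoModelForMaskedLM.from_pretrained("DJSammy/bert-base-danish-uncased_BotXO,ai")

inputs = tokenizer("København er [MASK] i Danmark.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Find the [MASK] position and take a softmax over the vocabulary there.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
probs = logits[0, mask_pos[0]].softmax(dim=-1)

top = probs.topk(5)
for score, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.convert_ids_to_tokens(int(token_id)):<15} {float(score):.4f}")
```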