ITALIAN-LEGAL-BERT is based on <a href="https://huggingface.co/dbmdz/bert-base-italian-xxl-cased">bert-base-italian-xxl-cased</a>, with additional pre-training on Italian civil law corpora.
It achieves better results than the ‘general-purpose’ Italian BERT on several domain-specific tasks.
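
As a quick check of the masked-language-modeling head, the model can be loaded with the `transformers` fill-mask pipeline. A minimal sketch follows; the model identifier and the example sentence are placeholders, not taken from this card:

```python
from transformers import pipeline

# Placeholder identifier: substitute the actual Hub id under which
# ITALIAN-LEGAL-BERT is published.
fill_mask = pipeline("fill-mask", model="<italian-legal-bert-hub-id>")

# The model is a cased Italian BERT, so the mask token is [MASK].
predictions = fill_mask("Il giudice ha emesso la [MASK] di primo grado.")
for p in predictions:
    print(p["token_str"], p["score"])
```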
<h2>Training procedure</h2>
We initialized ITALIAN-LEGAL-BERT with ITALIAN XXL BERT and pre-trained it for an additional 4 epochs on 3.7 GB of text from the National Jurisprudential Archive, using the Hugging Face PyTorch-Transformers library. We used the BERT architecture with a language modeling head on top and the AdamW optimizer, with an initial learning rate of 5e-5 (linear learning rate decay, ending at 2.525e-9), a sequence length of 512, and a batch size of 10 (imposed by GPU capacity), for 8.4 million training steps on a single V100 16 GB GPU.
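
A minimal sketch of this continued pre-training setup with the Hugging Face `Trainer` is shown below. The corpus file name and the data-loading details beyond the reported hyperparameters are illustrative assumptions, not the authors' actual training script:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Start from the general-purpose Italian checkpoint, as described above.
model_name = "dbmdz/bert-base-italian-xxl-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)  # BERT + LM head

# Hypothetical corpus file: one legal document per line of plain text.
dataset = load_dataset("text", data_files={"train": "civil_law_corpus.txt"})

def tokenize(batch):
    # Sequence length 512, matching the reported setup.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Standard masked-language-modeling collator: randomly masks 15% of tokens.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

# Hyperparameters from the paragraph above: AdamW (the Trainer default),
# initial learning rate 5e-5 with linear decay, batch size 10, 4 epochs.
args = TrainingArguments(
    output_dir="italian-legal-bert",
    num_train_epochs=4,
    per_device_train_batch_size=10,
    learning_rate=5e-5,
    lr_scheduler_type="linear",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```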