Daniele Licari committed · Commit 69d53a2 · Parent(s): a7523ac

Update README.md
README.md
CHANGED
@@ -19,7 +19,7 @@ with a language modeling head on top, AdamW Optimizer, initial learning rate 5e-
 linear learning rate decay, ends at 2.525e-9), sequence length 512, batch size 10 (imposed
 by GPU capacity), 8.4 million training steps, device 1*GPU V100 16GB
 
-
+<h2> Usage </h2>
 
 ITALIAN-LEGAL-BERT model can be loaded like:
 
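The README line "ITALIAN-LEGAL-BERT model can be loaded like:" introduces a loading snippet that falls outside this diff hunk. A minimal sketch with the Hugging Face `transformers` library, assuming the repo id `dlicari/Italian-Legal-BERT` (verify the exact name on the model card):

```python
# Sketch: load ITALIAN-LEGAL-BERT via Hugging Face transformers.
# The repo id below is an assumption; check the model card for the exact name.
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "dlicari/Italian-Legal-BERT"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)

# Encode an Italian legal sentence and get contextual embeddings.
inputs = tokenizer("Il ricorso è ammissibile.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
```

`AutoModel` returns the bare encoder; swap in `AutoModelForMaskedLM` to keep the language-modeling head the training description mentions.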