d1mitriz committed on
Commit 37958ac
1 Parent(s): 13361ad
Improved model card
README.md
CHANGED
@@ -9,7 +9,7 @@ datasets:
 metrics:
 - accuracy
 model-index:
-- name: greek-longformer-base-4096
+- name: greek-longformer-base-4096
 results:
 - task:
 name: Masked Language Modeling
@@ -27,14 +27,23 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-#
+# Greek Longformer
 
-
+A Greek version of the Longformer Language Model.
+
+This model is a (from scratch) Greek Longformer model based on the configuration of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096), and trained on the combined datasets from the [Greek Wikipedia](https://huggingface.co/datasets/wikipedia) and the Greek part of [OSCAR](https://huggingface.co/datasets/oscar-corpus/OSCAR-2301).
 It achieves the following results on the evaluation set:
 
 - Loss: 1.1080
 - Accuracy: 0.7765
 
+## Pre-training corpora
+
+The pre-training corpora of `greek-longformer-base-4096` include:
+
+- The Greek part of [Wikipedia](https://el.wikipedia.org/wiki/Βικιπαίδεια:Αντίγραφα_της_βάσης_δεδομένων),
+- The Greek part of [OSCAR](https://traces1.inria.fr/oscar/), a cleansed version of [Common Crawl](https://commoncrawl.org).
+
 ## Model description
 
 More information needed
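The evaluation loss in the card can be read as a perplexity. This is a minimal sketch, assuming the reported loss is a mean cross-entropy in nats over the masked tokens; the card does not confirm this. The function name `perplexity_from_loss` is hypothetical and used only for illustration.

```python
import math


def perplexity_from_loss(loss: float) -> float:
    """Convert a mean cross-entropy loss (in nats) into a perplexity.

    Assumption: the loss is averaged per predicted token and uses the
    natural logarithm; the model card does not state this explicitly.
    """
    if loss < 0:
        raise ValueError("cross-entropy loss cannot be negative")
    return math.exp(loss)


if __name__ == "__main__":
    reported_loss = 1.1080  # evaluation loss quoted in the model card
    # Under the assumption above, this comes out at roughly 3.03.
    print(f"perplexity ~ {perplexity_from_loss(reported_loss):.2f}")
```

Under that assumption, a loss of 1.1080 corresponds to a perplexity of about 3.03 on masked tokens. This is a different quantity from the reported accuracy of 0.7765, which counts only whether the top prediction is correct.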