jarodrigues committed
Commit 159dafc
Parent(s): c337dee
Update README.md
README.md CHANGED
@@ -22,13 +22,13 @@ datasets:
</br>
</br>
<img align="left" width="40" height="40" src="https://github.githubassets.com/images/icons/emoji/unicode/1f917.png">
-<p style="text-align: center;"> This is the model card for Gervásio 7B
+<p style="text-align: center;"> This is the model card for Gervásio 7B PTBR Decoder.
You may be interested in some of the other models in the <a href="https://huggingface.co/PORTULAN">Albertina (encoders) and Gervásio (decoders) families</a>.
</p>
</br>
</br>

-# Gervásio 7B
+# Gervásio 7B PTBR

</br>

@@ -45,7 +45,7 @@ namely for the European variant, spoken in Portugal ([**gervasio-7b-portuguese-p
All versions of Gervásio are **openly distributed for free under an open license**, including thus for research and commercial purposes, and given its size, can
be run on consumer-grade hardware.

-**Gervásio 7B
+**Gervásio 7B PTBR** is developed by NLX-Natural Language and Speech Group, at the University of Lisbon, Faculty of Sciences, Department of Informatics, Portugal.

For the record, its full name is **Gervásio Produz Textos em Português**, to which corresponds the natural acronym **GPT PT**,
and which is known more shortly as **Gervásio PT-*** or, even more briefly, just as **Gervásio**, among its acquaintances.
@@ -73,7 +73,7 @@ Please use the above cannonical reference when using or citing this model.

# Model Description

-**This model card is for Gervásio 7B
+**This model card is for Gervásio 7B PTBR**, with 7 billion parameters, a hidden size of 4096 units, an intermediate size of 11,008 units, 32 attention heads, 32 hidden layers, and a tokenizer obtained using the Byte-Pair Encoding (BPE) algorithm implemented with SentencePiece, featuring a vocabulary size of 32,000.

Gervásio-7B-PTBR-Decoder is distributed under an [MIT license](https://huggingface.co/PORTULAN/gervasio-7b-portuguese-ptpt-decoder/blob/main/LICENSE).

@@ -82,7 +82,7 @@ Gervásio-7B-PTBR-Decoder is distributed under an [MIT license](https://huggingf

# Training Data

-**Gervásio 7B
+**Gervásio 7B PTBR** was trained over standard supervised fine-tuning, and to keep some alignment with mainstream benchmarks for English, we resorted to tasks and respective datasets in the GLUE and the SuperGLUE collections.


We selected those datasets where the outcome of their machine translation into American Portuguese could preserve, in the target language, the linguistic properties at stake.
@@ -128,7 +128,7 @@ This involves repurposing the tasks in various ways, such as generation of answe

| Model | MRPC (F1) | RTE (F1) | COPA (F1) |
|--------------------------|----------------|----------------|-----------|
-| **Gervásio 7B
+| **Gervásio 7B PTBR** | **0.7822** | **0.8321** | 0.2134 |
| **LLaMA-2** | 0.0369 | 0.0516 | 0.4867 |
| **LLaMA-2 Chat** | 0.5432 | 0.3807 | **0.5493**|
<br>
@@ -144,7 +144,7 @@ To evaluate Gervásio, the examples were randomly selected to be included in the

| Model | ENEM 2022 (Accuracy) | BLUEX (Accuracy)| RTE (F1) | STS (Pearson) |
|--------------------------|----------------------|-----------------|-----------|---------------|
-| **Gervásio 7B
+| **Gervásio 7B PTBR** | 0.1977 | 0.2640 | **0.7469**| **0.2136** |
| **LLaMA-2** | 0.2458 | 0.2903 | 0.0913 | 0.1034 |
| **LLaMA-2 Chat** | 0.2231 | 0.2959 | 0.5546 | 0.1750 |
||||||
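
As context for the configuration figures quoted in the updated Model Description, here is a minimal sketch of inspecting them and running the decoder with the Hugging Face `transformers` library. The repository id `PORTULAN/gervasio-7b-portuguese-ptbr-decoder` is an assumption made by analogy with the PTPT decoder linked in the license line, and the prompt is illustrative only.

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

# Repository id assumed by analogy with the PTPT decoder; adjust if the PTBR card differs.
repo_id = "PORTULAN/gervasio-7b-portuguese-ptbr-decoder"

# The configuration alone is enough to check the figures quoted in the Model Description.
config = AutoConfig.from_pretrained(repo_id)
print(config.hidden_size)          # expected 4096
print(config.intermediate_size)    # expected 11008
print(config.num_attention_heads)  # expected 32
print(config.num_hidden_layers)    # expected 32
print(config.vocab_size)           # expected 32000 (SentencePiece BPE vocabulary)

# Loading the full 7B decoder for generation; per the card it runs on consumer-grade
# hardware, here in half precision and placed automatically across available devices.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto", device_map="auto")

prompt = "O Rio de Janeiro é conhecido por"  # illustrative prompt, not from the model card
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that `AutoConfig` does not download the 7B checkpoint, so the architecture check is cheap; only the generation part requires the full weights.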