jarodrigues committed · Commit 243d308 · 1 Parent(s): 347e432

Update README.md

README.md CHANGED
@@ -22,17 +22,17 @@ datasets:
</br>
</br>
<img align="left" width="40" height="40" src="https://github.githubassets.com/images/icons/emoji/unicode/1f917.png">
- <p style="text-align: center;"> This is the model card for Gervásio 7B
+ <p style="text-align: center;"> This is the model card for Gervásio 7B PTPT Decoder.
You may be interested in some of the other models in the <a href="https://huggingface.co/PORTULAN">Albertina (encoders) and Gervásio (decoders) families</a>.
</p>
</br>
</br>

- # Gervásio 7B
+ # Gervásio 7B PTPT

</br>

- **Gervásio PT
+ **Gervásio PT*** is a **fully open** decoder for the **Portuguese language**.


It is a **decoder** of the LLaMA family, based on the neural architecture Transformer and developed over the LLaMA-2 7B model.
@@ -45,19 +45,21 @@ namely for the European variant, spoken in Portugal ([**gervasio-7b-portuguese-p
All versions of Gervásio are **openly distributed for free under an open license**, including thus for research and commercial purposes, and given its size, can
be run on consumer-grade hardware.

- **Gervásio 7B
+ **Gervásio 7B PTPT** is developed by NLX-Natural Language and Speech Group, at the University of Lisbon, Faculty of Sciences, Department of Informatics, Portugal.

For the record, its full name is **Gervásio Produz Textos em Português**, to which corresponds the natural acronym **GPT PT**,
- and which is known more shortly as Gervásio PT
+ and which is known more shortly as **Gervásio PT*** or, even more briefly, just as **Gervásio**, among its acquaintances.

- These models are fully documented in the respective [publication](https://arxiv.org/abs
+ These models are fully documented in the respective [publication](https://arxiv.org/abs/2402.18766):

``` latex
@misc{gervasio,
- title={Advancing Generative AI for Portuguese with
-
+ title={Advancing Generative AI for Portuguese with
+ Open Decoder Gervásio PT-*},
+ author={Rodrigo Santos, João Silva, Luís Gomes,
+ João Rodrigues, António Branco},
year={2024},
- eprint={
+ eprint={2402.18766},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@@ -71,16 +73,16 @@ Please use the above canonical reference when using or citing this model.

# Model Description

- **This model card is for Gervásio 7B
+ **This model card is for Gervásio 7B PTPT**, with 7 billion parameters, a hidden size of 4,096 units, an intermediate size of 11,008 units, 32 attention heads, 32 hidden layers, and a tokenizer obtained using the Byte-Pair Encoding (BPE) algorithm implemented with SentencePiece, featuring a vocabulary size of 32,000.

- Gervásio
+ Gervásio 7B PTPT is distributed under an [MIT license](https://huggingface.co/PORTULAN/gervasio-7b-portuguese-ptbr-decoder/blob/main/LICENSE).


<br>

# Training Data

- **Gervásio 7B
+ **Gervásio 7B PTPT** was trained over standard supervised fine-tuning, and to keep some alignment with mainstream benchmarks for English, we resorted to tasks and respective datasets in the GLUE and the SuperGLUE collections.


We selected those datasets where the outcome of their machine translation into European Portuguese could preserve, in the target language, the linguistic properties at stake.
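As a minimal usage sketch (not part of the original card): a decoder with the configuration added in this commit can be loaded and queried with the Hugging Face Transformers library roughly as follows. The repository id below is an assumption based on the PTPT naming used in the card.

```python
# Hypothetical example, not from the original model card: loading the decoder
# with Hugging Face Transformers. The repository id is an assumption based on
# the PTPT naming used in this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PORTULAN/gervasio-7b-portuguese-ptpt-decoder"  # assumed repo id

# The card describes a SentencePiece BPE tokenizer with a 32,000-token vocabulary;
# it is loaded alongside the model weights.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 7B model within a single GPU
    device_map="auto",
)

prompt = "A culinária portuguesa é rica em"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Loading in bfloat16 (or a quantized variant) is one way to keep the memory footprint of a 7B-parameter model within reach of the consumer-grade hardware the card mentions.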
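To make the fine-tuning setup concrete, here is a purely illustrative sketch of how one machine-translated, GLUE-style entailment example might be cast as a prompt/completion pair for supervised fine-tuning. The template, field names, and answer format are assumptions for illustration only; the actual instruction templates are documented in the publication cited above.

```python
# Purely illustrative sketch: one way a machine-translated, GLUE-style
# entailment example could be turned into a prompt/completion pair for
# supervised fine-tuning. The field names and template below are assumptions,
# not the exact format used to train Gervásio (see the cited publication).
def to_sft_pair(premise: str, hypothesis: str, label: str) -> dict:
    prompt = (
        f"Premissa: {premise}\n"
        f"Hipótese: {hypothesis}\n"
        "A hipótese decorre da premissa? Responda Sim ou Não.\n"
        "Resposta:"
    )
    return {"prompt": prompt, "completion": " " + label}

example = to_sft_pair(
    premise="O modelo foi desenvolvido na Faculdade de Ciências da Universidade de Lisboa.",
    hypothesis="O modelo foi desenvolvido em Portugal.",
    label="Sim",
)
print(example["prompt"] + example["completion"])
```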