jarodrigues
committed on
Update README.md
README.md
CHANGED
@@ -20,39 +20,40 @@ tags:
- portuguese
- decoder
- foundation model
datasets:
- PORTULAN/glue-ptpt
---
</br>
</br>
<img align="left" width="40" height="40" src="https://github.githubassets.com/images/icons/emoji/unicode/1f917.png">
<p style="text-align: center;"> This is the model card for Gervásio 7B PT-PT Decoder.
You may be interested in some of the other models in the <a href="https://huggingface.co/PORTULAN">Albertina (encoders) and Gervásio (decoders) families</a>.
</p>
</br>
</br>

# Gervásio 7B PT-PT

</br>

**Gervásio PT-*** is a **fully open** decoder for the **Portuguese language**.

It is a **decoder** of the LLaMA family, based on the Transformer neural architecture and developed over the LLaMA 2 7B model.
It was further improved through additional training over language resources that include new instruction data sets of Portuguese prepared for this purpose.

It has different versions that were trained for different variants of Portuguese (PT),
namely the European variant, spoken in Portugal (**PT-PT**), and the American variant, spoken in Brazil (**PT-BR**).

All versions of Gervásio are **openly distributed for free under an open license**, including for research and commercial purposes, and, given their size, can
be run on consumer-grade hardware.

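Since the models are meant to run on consumer-grade hardware, a minimal sketch of loading and prompting Gervásio with the Hugging Face `transformers` library could look as follows; the repository identifier and the half-precision choice are assumptions, not details stated in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id; check the PORTULAN organization page for the exact name.
model_id = "PORTULAN/gervasio-7b-portuguese-ptpt-decoder"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Half precision keeps the 7B model within the memory of a consumer-grade GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = "A capital de Portugal é"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
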
**Gervásio 7B PT-PT** is developed by NLX-Natural Language and Speech Group, at the University of Lisbon, Faculty of Sciences, Department of Informatics, Portugal.

For the record, its full name is **Gervásio Produz Textos em Português**, to which corresponds the natural acronym **GPT PT**,
and which is known, more shortly, as **Gervásio PT-***, or even more briefly just as **Gervásio**, among his acquaintances.

These models are fully documented in the respective [publication](https://arxiv.org/abs/?):

``` latex
@misc{albertina-pt,

@@ -73,15 +74,15 @@ Please use the above canonical reference when using or citing this model.

# Model Description

**This model card is for Gervásio 7B PT-PT**, with 7 billion parameters, a hidden size of 4096 units, an intermediate size of 11,008 units, 32 attention heads, 32 hidden layers, and a tokenizer obtained using the Byte-Pair Encoding (BPE) algorithm implemented with SentencePiece, featuring a vocabulary size of 32,000.
Gervásio-7B-PTPT-Decoder is distributed under an [MIT license](https://huggingface.co/PORTULAN/albertina-ptpt/blob/main/LICENSE).
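
These hyperparameters can be read back from the published configuration; the sketch below again assumes the (unconfirmed) repository identifier and uses only standard `transformers` accessors.

```python
from transformers import AutoConfig, AutoTokenizer

# Assumed repository id, as in the loading sketch above.
model_id = "PORTULAN/gervasio-7b-portuguese-ptpt-decoder"

config = AutoConfig.from_pretrained(model_id)
print(config.hidden_size)          # expected: 4096
print(config.intermediate_size)    # expected: 11008
print(config.num_attention_heads)  # expected: 32
print(config.num_hidden_layers)    # expected: 32

tokenizer = AutoTokenizer.from_pretrained(model_id)
print(tokenizer.vocab_size)        # expected: 32000
```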

<br>

# Training Data

**Gervásio 7B PT-PT** was trained over standard supervised fine-tuning, and to keep some alignment with mainstream benchmarks for English, we resorted to tasks and respective datasets in the GLUE and the SuperGLUE collections.

We selected those datasets where the outcome of their machine translation into Portuguese could preserve, in the target language, the linguistic properties at stake.
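
For illustration, one of the machine-translated tasks in `PORTULAN/glue-ptpt` might be inspected with the `datasets` library; the configuration name used below is an assumption about how the collection is organized.

```python
from datasets import load_dataset

# "rte" is an assumed task/configuration name; the collection may expose the
# translated GLUE tasks under different identifiers.
glue_ptpt = load_dataset("PORTULAN/glue-ptpt", "rte")

print(glue_ptpt)                              # splits and sizes
first_split = next(iter(glue_ptpt.values()))  # whichever split comes first
print(first_split[0])                         # one translated example
```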