jarodrigues committed
Update README.md
README.md CHANGED
In other words, each example occupies the full input sequence length.

For testing, we reserved the translated datasets MRPC (similarity) and RTE (inference) from GLUE, and COPA (reasoning/QA) from SuperGLUE. These were taken as representatives of three major types of tasks and were not seen during training.

| Model                    | MRPC (F1)      | RTE (F1)       | COPA (F1) |
|--------------------------|----------------|----------------|-----------|
| **Gervásio 7B PTBR**     | **0.7822**     | **0.8321**     | 0.2134    |
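The evaluation harness itself is not part of this README. For orientation only, here is a minimal sketch of how F1 scores like those in the table above could be computed once gold labels and model predictions have been collected; the labels, the values and the macro-averaging choice are assumptions, not details taken from the actual evaluation.

```python
# Minimal sketch (not the project's evaluation harness) of computing an
# F1 score, assuming gold labels and predictions were collected upstream.
from sklearn.metrics import f1_score

# Hypothetical gold labels and predictions for a binary task such as MRPC
# or RTE (0 = negative class, 1 = positive class); values are illustrative.
gold = [1, 0, 1, 1, 0, 1]
pred = [1, 0, 0, 1, 0, 1]

# The README does not state the averaging mode; "macro" is an assumption.
print(f"F1: {f1_score(gold, pred, average='macro'):.4f}")
```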
To further test our decoder, in addition to the test data described above, we also used datasets originally developed with texts in Portuguese: ASSIN2 RTE (entailment), ASSIN2 STS (similarity), BLUEX (question answering), ENEM 2022 (question answering) and FaQuAD (extractive question answering).

| Model                    | ENEM 2022 (Accuracy) | BLUEX (Accuracy) | RTE (F1)   | STS (Pearson) |
|--------------------------|----------------------|------------------|------------|---------------|
| **Gervásio 7B PTBR**     | 0.1977               | 0.2640           | **0.7469** | **0.2136**    |
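Likewise, a minimal sketch of the two metrics that are new in this table, accuracy (ENEM 2022, BLUEX) and Pearson correlation (ASSIN2 STS); again, every name and value below is hypothetical and only illustrates how such scores are typically computed.

```python
# Minimal sketch of the accuracy and Pearson metrics, assuming model
# predictions were collected elsewhere; all values are hypothetical.
from scipy.stats import pearsonr
from sklearn.metrics import accuracy_score

# Accuracy over multiple-choice answers (ENEM 2022 / BLUEX style QA).
gold_choices = ["A", "C", "B", "D", "A"]
pred_choices = ["A", "B", "B", "D", "C"]
print(f"Accuracy: {accuracy_score(gold_choices, pred_choices):.4f}")

# Pearson correlation for STS (ASSIN2 gold similarity is on a 1-5 scale).
gold_sts = [3.5, 1.0, 4.8, 2.2, 5.0]
pred_sts = [3.0, 1.5, 4.5, 2.0, 4.2]
print(f"Pearson: {pearsonr(gold_sts, pred_sts)[0]:.4f}")
```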