jarodrigues committed on
Commit ce02612
1 Parent(s): 88fcd72

Update README.md

Files changed (1)
  1. README.md +5 -2
README.md CHANGED
@@ -106,6 +106,10 @@ These datasets were machine translated into Portuguese and from the [extraGLUE](
 Furthermore, instruction templates have been manually crafted for each task.
 These take the various fields in the dataset and arrange them into prompts, which were collected into the [extraGLUE-instruct](https://huggingface.co/datasets/PORTULAN/extraglue-instruct) dataset.
 
+We also employ data augmentation techniques to enhance the size and diversity of our dataset.
+This involves repurposing the tasks in various ways, such as generation of answers from MultiRC, question generation from BoolQ, and other relevant modifications.
+
+
 # Training Details
 
 We applied supervised fine-tuning with a causal language modeling training objective following a zero-out technique during the fine-tuning process.
@@ -120,8 +124,7 @@ In other words, each example occupies the full input sequence length.
 # Evaluation
 
 For testing, we reserved the translated datasets MRPC (similarity) and RTE (inference), from GLUE, and COPA (reasoning/qa), from SuperGLUE, which were taken as representatives of three major types of tasks, and were not seen during training.
-We also employ data augmentation techniques to enhance the size and diversity of our dataset.
-This involves repurposing the tasks in various ways, such as generation of answers from MultiRC, question generation from BoolQ, and other relevant modifications.
+
 
 
 | Model | MRPC (F1) | RTE (F1) | COPA (F1) |
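
To illustrate what "arranging the fields of a dataset into a prompt" can look like, here is a minimal Python sketch for an MRPC-style record. The field names, the Portuguese wording, and the `mrpc_prompt` helper are assumptions made for illustration only; the templates actually used are those collected in the [extraGLUE-instruct](https://huggingface.co/datasets/PORTULAN/extraglue-instruct) dataset.

```python
# Minimal sketch (not the authors' actual template): turn an MRPC-style record
# (two sentences plus a binary paraphrase label) into an instruction prompt.
# Field names and wording are illustrative assumptions.

def mrpc_prompt(example: dict) -> str:
    # Arrange the dataset fields into a single prompt string.
    return (
        "Indica se as duas frases seguintes são paráfrases uma da outra.\n"
        f"Frase 1: {example['sentence1']}\n"
        f"Frase 2: {example['sentence2']}\n"
        "Resposta:"
    )

example = {
    "sentence1": "A empresa anunciou lucros recordes este ano.",
    "sentence2": "Este ano, a empresa registou lucros sem precedentes.",
    "label": 1,  # 1 = paraphrase, 0 = not a paraphrase
}

prompt = mrpc_prompt(example)
target = "Sim" if example["label"] == 1 else "Não"  # answer text paired with the prompt
print(prompt, target)
```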