versae committed
Commit cb69a58
1 Parent(s): 94523fa

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -83,7 +83,7 @@ The dataset is a translation to Spanish of [alpaca_data_cleaned.json](https://gi
 
 ## Finetuning
 
-To fine-tune the BERTIN GPT-J-6B model we used the code available on [BERTIN's fork of `mesh-transformer-jax`](https://github.com/bertin-project/mesh-transformer-jax/blob/master/prepare_dataset_alpaca.py), which provides code to adapt an Alpaca dataset to finetune any GPT-J-6B model. We ran finetuning for 3 epochs using a sequence length of 2048 with no gradient accumulation on a single TPUv3-8 for 3 hours on top of BERTIN GPT-J-6B.
+To fine-tune the BERTIN GPT-J-6B model we used the code available on [BERTIN's fork of `mesh-transformer-jax`](https://github.com/bertin-project/mesh-transformer-jax/blob/master/prepare_dataset_alpaca.py), which provides code to adapt an Alpaca dataset to finetune any GPT-J-6B model. We ran finetuning for 3 epochs using a sequence length of 2048 on a single TPUv3-8 for 3 hours on top of BERTIN GPT-J-6B.
 
 ## Example outputs
 
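
To give a feel for what the dataset preparation step referenced in the changed line does, the sketch below shows one plausible way to turn Alpaca-style records (instruction/input/output fields) into plain-text prompts for causal-LM finetuning. It is a minimal sketch, not the actual `prepare_dataset_alpaca.py` from BERTIN's fork: the Spanish prompt template, the input file name `alpaca_data_cleaned_spanish.json`, and the `<|endoftext|>` separator are assumptions for illustration only.

```python
import json

# Hypothetical prompt templates (in Spanish, to match the translated dataset);
# the real prepare_dataset_alpaca.py may format examples differently.
PROMPT_WITH_INPUT = (
    "A continuación hay una instrucción que describe una tarea, junto con una "
    "entrada que proporciona más contexto. Escribe una respuesta que complete "
    "adecuadamente la petición.\n\n"
    "### Instrucción:\n{instruction}\n\n### Entrada:\n{input}\n\n### Respuesta:\n{output}"
)
PROMPT_NO_INPUT = (
    "A continuación hay una instrucción que describe una tarea. Escribe una "
    "respuesta que complete adecuadamente la petición.\n\n"
    "### Instrucción:\n{instruction}\n\n### Respuesta:\n{output}"
)


def record_to_text(record: dict) -> str:
    """Render one Alpaca-style record (instruction/input/output) as a single training string."""
    template = PROMPT_WITH_INPUT if record.get("input") else PROMPT_NO_INPUT
    return template.format(
        instruction=record["instruction"],
        input=record.get("input", ""),
        output=record["output"],
    )


def main(in_path: str = "alpaca_data_cleaned_spanish.json", out_path: str = "train.txt") -> None:
    # Read the translated Alpaca JSON (a list of dicts) and write one example per block,
    # separated by an end-of-text marker so the downstream tokenizer can split documents.
    with open(in_path, encoding="utf-8") as f:
        records = json.load(f)
    with open(out_path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(record_to_text(record) + "\n<|endoftext|>\n")


if __name__ == "__main__":
    main()
```

Under this reading, the resulting text file would then be tokenized and packed into 2048-token sequences by the `mesh-transformer-jax` training pipeline, matching the sequence length mentioned in the README.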