versae committed
Commit cb69a58
1 Parent(s): 94523fa

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -83,7 +83,7 @@ The dataset is a translation to Spanish of [alpaca_data_cleaned.json](https://gi
 
 ## Finetuning
 
-To fine-tune the BERTIN GPT-J-6B model we used the code available on [BERTIN's fork of `mesh-transformer-jax`](https://github.com/bertin-project/mesh-transformer-jax/blob/master/prepare_dataset_alpaca.py), which provides code to adapt an Alpaca dataset to finetune any GPT-J-6B model. We ran finetuning for 3 epochs using a sequence length of 2048 with no gradient accumulation on a single TPUv3-8 for 3 hours on top of BERTIN GPT-J-6B.
+To fine-tune the BERTIN GPT-J-6B model we used the code available on [BERTIN's fork of `mesh-transformer-jax`](https://github.com/bertin-project/mesh-transformer-jax/blob/master/prepare_dataset_alpaca.py), which provides code to adapt an Alpaca dataset to finetune any GPT-J-6B model. We ran finetuning for 3 epochs using a sequence length of 2048 on a single TPUv3-8 for 3 hours on top of BERTIN GPT-J-6B.
 
 ## Example outputs
 
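
To give a feel for what the dataset preparation step referenced in the changed line does, the sketch below shows one plausible way to turn Alpaca-style records (instruction/input/output fields) into plain-text prompts for causal-LM finetuning. It is a minimal sketch, not the actual `prepare_dataset_alpaca.py` from BERTIN's fork: the Spanish prompt template, the input file name `alpaca_data_cleaned_spanish.json`, and the `<|endoftext|>` separator are assumptions for illustration only.

```python
import json

# Hypothetical prompt templates (in Spanish, to match the translated dataset);
# the real prepare_dataset_alpaca.py may format examples differently.
PROMPT_WITH_INPUT = (
    "A continuación hay una instrucción que describe una tarea, junto con una "
    "entrada que proporciona más contexto. Escribe una respuesta que complete "
    "adecuadamente la petición.\n\n"
    "### Instrucción:\n{instruction}\n\n### Entrada:\n{input}\n\n### Respuesta:\n{output}"
)
PROMPT_NO_INPUT = (
    "A continuación hay una instrucción que describe una tarea. Escribe una "
    "respuesta que complete adecuadamente la petición.\n\n"
    "### Instrucción:\n{instruction}\n\n### Respuesta:\n{output}"
)


def record_to_text(record: dict) -> str:
    """Render one Alpaca-style record (instruction/input/output) as a single training string."""
    template = PROMPT_WITH_INPUT if record.get("input") else PROMPT_NO_INPUT
    return template.format(
        instruction=record["instruction"],
        input=record.get("input", ""),
        output=record["output"],
    )


def main(in_path: str = "alpaca_data_cleaned_spanish.json", out_path: str = "train.txt") -> None:
    # Read the translated Alpaca JSON (a list of dicts) and write one example per block,
    # separated by an end-of-text marker so the downstream tokenizer can split documents.
    with open(in_path, encoding="utf-8") as f:
        records = json.load(f)
    with open(out_path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(record_to_text(record) + "\n<|endoftext|>\n")


if __name__ == "__main__":
    main()
```

Under this reading, the resulting text file would then be tokenized and packed into 2048-token sequences by the `mesh-transformer-jax` training pipeline, matching the sequence length mentioned in the README.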