### Language adaptation and training

The language adaptation technique used to train FLOR-1.3B-GL is based on the one used to train FLOR-1.3B, which is explained by its authors in this [Medium post](https://medium.com/@mpamies247/flor-6-3b-a-chinchilla-compliant-model-for-catalan-spanish-and-english-7cdb389a9aac). In summary, we proceeded as follows:
1) We trained our own BPE tokenizer for Galician and replaced the original FLOR-1.3B tokenizer and vocabulary with it.
2) The embeddings corresponding to tokens present in both the original and the target vocabulary (matching tokens) were used for initialization.
3) The embeddings of tokens not present in the original FLOR-1.3B vocabulary were initialized as the average of all embeddings.
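The embedding-initialization steps above can be sketched as follows. This is a minimal, self-contained illustration, not the actual training code: the function name, the dictionary-based embedding representation, and the toy vocabularies are all assumptions made for clarity.

```python
# Hypothetical sketch of steps 2) and 3): build target-vocabulary embeddings
# from a source model's embedding table. Matching tokens copy their source
# vector; tokens new to the target vocabulary get the mean of all source
# embeddings.

def init_target_embeddings(source_emb, target_vocab):
    """source_emb: dict token -> embedding vector (list of floats).
    target_vocab: iterable of tokens in the new (target) vocabulary."""
    dim = len(next(iter(source_emb.values())))
    # Average of all source embeddings, used to initialize unseen tokens.
    mean_vec = [
        sum(vec[i] for vec in source_emb.values()) / len(source_emb)
        for i in range(dim)
    ]
    target_emb = {}
    for tok in target_vocab:
        if tok in source_emb:
            target_emb[tok] = source_emb[tok]  # matching token: reuse vector
        else:
            target_emb[tok] = mean_vec  # new token: average-of-all init
    return target_emb
```

In practice, with HuggingFace Transformers one would perform the analogous operation on the model's embedding matrix after swapping in the new tokenizer, but the copy-or-average logic is the same as above.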