Embedding matrix size #3
opened by mrtnm
The following code shows that the first dimension of the embedding matrix is 384:
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained('google/byt5-base')
# The input embedding matrix has shape (vocab_size, hidden_size)
print(model.get_input_embeddings().weight.shape)
# torch.Size([384, 1536])
Why is this not 259, i.e. one vector for each of the 256 values a single byte can take, plus the 3 additional special tokens mentioned in the paper?
I could not find any mention of this in the paper.
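As a sanity check, here is a minimal sketch (using the same google/byt5-base checkpoint as above) that compares the expected 259 with the vocabulary size reported by the model config; the variable names are just for illustration:

from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained('google/byt5-base')

# 256 possible byte values plus the 3 special tokens described in the paper
expected = 256 + 3
# vocab_size determines the first dimension of the embedding matrix
actual = model.config.vocab_size

print(expected)           # 259
print(actual)             # 384, matching the embedding shape above
print(actual - expected)  # the extra rows I could not find an explanation for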