Embedding matrix size

#3 by mrtnm

The following code shows that the first dimension of the embedding matrix is 384.

from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained('google/byt5-base')
print(model.get_input_embeddings().weight.shape)
# torch.Size([384, 1536])

Why is this not 259, i.e. one vector for each of the 256 possible byte values, plus the 3 additional special tokens mentioned in the paper?

I could not find any mention of this in the paper.
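For reference, the model config reports the same number without loading any weights. Here is a minimal sketch of that check (the expected count of 259 is my own arithmetic: 256 byte values plus the 3 special tokens):

from transformers import AutoConfig

# The config declares the same vocabulary size the embedding matrix shows.
config = AutoConfig.from_pretrained('google/byt5-base')
print(config.vocab_size)  # 384

# Expected from the paper: one entry per byte value plus 3 special tokens.
expected = 256 + 3
print(expected)  # 259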
