The output size when deployed in GCP is 1536 instead of 1024
Hello!
The default behaviour uses 1024 because it loads the Dense module from 2_Dense_1024: https://huggingface.co/dunzhang/stella_en_1.5B_v5/blob/main/modules.json#L15-L19
Whereas GCP will likely read the "usual" 2_Dense folder and use that one instead. That folder has the 1536 that you're experiencing.
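As a quick sanity check, you can print the module list that modules.json defines. A minimal sketch, assuming only that `huggingface_hub` is installed:

```python
# Minimal sketch: inspect which Dense folder modules.json points at.
import json

from huggingface_hub import hf_hub_download

path = hf_hub_download("dunzhang/stella_en_1.5B_v5", "modules.json")
with open(path) as f:
    for module in json.load(f):
        print(module["idx"], module["path"], module["type"])
# The Dense entry points at 2_Dense_1024, which is why local loads give 1024 dims.
```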
I see you already created a clone of this model to try to fix it, but I think your fix might be wrong (i.e. you're not using any Dense module anymore). I would fix it like this:
- Clone the model
- Rename 2_Dense to 2_Dense_1536
- Rename 2_Dense_1024 to 2_Dense
- Update modules.json to use 2_Dense instead of 2_Dense_1024.
Then both Sentence Transformers and GCP should use the 1024-dimensional output with the Dense module (which is important to get the correct performance!).
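A rough sketch of those steps on a local clone (the clone path is just a placeholder for wherever you checked out your copy):

```python
# Hedged sketch: perform the folder renames and the modules.json update
# inside a local git clone of your copy of the model, then commit and push.
import json
from pathlib import Path

repo = Path("stella_en_1.5B_v5")  # placeholder: path to your local clone

(repo / "2_Dense").rename(repo / "2_Dense_1536")
(repo / "2_Dense_1024").rename(repo / "2_Dense")

modules = json.loads((repo / "modules.json").read_text())
for module in modules:
    if module.get("path") == "2_Dense_1024":
        module["path"] = "2_Dense"
(repo / "modules.json").write_text(json.dumps(modules, indent=2))
```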
- Tom Aarsen
Hi Tom, thanks for the response
I looked inside 2_Dense and saw this:
"out_features": 8192,
Does this mean the output is 8192 dimensions when this layer is used?
To me, it seems the one-click GCP deployment doesn't use any of the 2_Dense_* layers.
Yes. I would advise against using it, though: the MTEB score of 1024d is only 0.001 lower than that of 8192d, so the extra dimensions buy you very little.
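Each Dense folder ships a small config.json with its in_features and out_features, so you can compare them all at once. A small sketch over a local download (the path is a placeholder):

```python
# Sketch: print the input/output dimensions of every 2_Dense* folder.
import json
from pathlib import Path

repo = Path("stella_en_1.5B_v5")  # placeholder: local download of the model
for folder in sorted(repo.glob("2_Dense*")):
    cfg = json.loads((folder / "config.json").read_text())
    print(folder.name, cfg["in_features"], "->", cfg["out_features"])
```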
Having said that, I think my original assumption here:
> Whereas GCP will likely read the "usual" 2_Dense folder and use that one instead. That folder has the 1536 that you're experiencing.
was wrong. Perhaps GCP just doesn't use any Dense layer at all? That will result in worse performance, I'm afraid.
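You can emulate that "no Dense" behaviour locally by rebuilding the model from just its first two modules. A sketch (trust_remote_code follows the model card, and the 1536 matches what you're seeing from GCP):

```python
# Sketch: "transformer + pooling only", i.e. the pipeline without any Dense module.
from sentence_transformers import SentenceTransformer

full = SentenceTransformer("dunzhang/stella_en_1.5B_v5", trust_remote_code=True)
no_dense = SentenceTransformer(modules=[full[0], full[1]])  # Transformer + Pooling

print(full.encode(["hello"]).shape)      # (1, 1024): with the 2_Dense_1024 module
print(no_dense.encode(["hello"]).shape)  # (1, 1536): matches the GCP deployment
```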
@philschmid do you have some experience with this? Or @olivierdehaene, due to TEI?
- Tom Aarsen
I took a look at the TEI code and it seems TEI only reads the 1_Pooling layer. But I would definitely appreciate the view of someone who has expertise on that.
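If that's the case, one possible client-side workaround is to apply the repo's 2_Dense_1024 projection yourself on top of the 1536-dim vectors TEI returns. A hedged sketch, assuming TEI's pooling matches 1_Pooling exactly (the local path is a placeholder):

```python
# Sketch: apply the 2_Dense_1024 module on top of a pooled TEI embedding.
import torch
from sentence_transformers.models import Dense

dense = Dense.load("stella_en_1.5B_v5/2_Dense_1024")  # placeholder local path
tei_embedding = torch.randn(1, 1536)  # stand-in for a real TEI response vector

with torch.no_grad():
    out = dense({"sentence_embedding": tei_embedding})["sentence_embedding"]
print(out.shape)  # torch.Size([1, 1024])
```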
@philschmid @olivierdehaene any updates on Stella with TEI?
Hi @philschmid @olivierdehaene, any workaround or update on this for Stella on TEI?
The output length is 1536 instead of 1024. I used the one-click deploy, and the embeddings don't match between loading the model locally for training and using the deployment for inference. Could you make the model that TEI loads also use 1024 dims?
I encountered the same problem: I changed the configuration, but the result is still 1536 when deploying through text-embeddings-inference.