The output size when deployed in GCP is 1536 instead of 1024
Hello!
The reason that the default behaviour uses 1024 is that it loads the Dense module from 2_Dense_1024: https://huggingface.co/dunzhang/stella_en_1.5B_v5/blob/main/modules.json#L15-L19
Whereas GCP will likely read the "usual" 2_Dense folder and use that one instead. That folder is the source of the 1536 dimensions you're experiencing.
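For reference, the modules.json lines linked above point the final module at the 2_Dense_1024 folder. The entry presumably looks like this (a sketch following the standard Sentence Transformers modules.json format; the exact idx/name values are assumptions):

```json
{
  "idx": 2,
  "name": "2",
  "path": "2_Dense_1024",
  "type": "sentence_transformers.models.Dense"
}
```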
I see you already created a clone of this model to try to fix it, but I think your fix might be wrong (i.e. you're not using any Dense module anymore). I would fix it like this:
- Clone the model
- Rename 2_Dense to 2_Dense_1536
- Rename 2_Dense_1024 to 2_Dense
- Update modules.json to use 2_Dense instead of 2_Dense_1024.
Then both Sentence Transformers and GCP should use the 1024 dimensions with the Dense module (which is important to get the correct performance!). You can verify with something like the check below.
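A quick local check of the fixed clone (a sketch; the repo path is a placeholder for your clone, and I believe this model needs trust_remote_code=True):

```python
from sentence_transformers import SentenceTransformer

# Load your fixed clone (placeholder path)
model = SentenceTransformer("your-username/stella_en_1.5B_v5", trust_remote_code=True)

emb = model.encode(["a quick dimensionality check"])
print(emb.shape)  # expected: (1, 1024) once 2_Dense holds the 1024d weights
```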
- Tom Aarsen
Hi Tom, thanks for the response!
I looked inside 2_Dense and saw this:
`"out_features": 8192,`
Does this mean the output is 8192 dimensions when this layer is used?
To me, it seems the one click GCP deployment doesn't use any of the 2_Dense_* layers.
Yes. I would advise against using it, though: the MTEB score at 1024d is only 0.001 lower than at 8192d, so the extra dimensions buy you almost nothing.
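(For context: out_features lives in the Dense folder's config.json. Only the 8192 value above is confirmed; the remaining fields below are assumptions following the standard Sentence Transformers Dense config, with in_features matching the 1536d pooled output:)

```json
{
  "in_features": 1536,
  "out_features": 8192,
  "bias": true,
  "activation_function": "torch.nn.modules.linear.Identity"
}
```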
Having said that, I think my original assumption here:
> Whereas GCP will likely read the "usual" 2_Dense folder and use that one instead. That folder is the source of the 1536 dimensions you're experiencing.

was wrong. I think GCP perhaps just doesn't use any Dense layer at all? That will result in worse performance, I'm afraid.
@philschmid do you have some experience with this? Or @olivierdehaene, due to TEI?
- Tom Aarsen
I took a look at the TEI code, and it seems TEI only reads the 1_Pooling layer. But I would definitely appreciate the view of someone with expertise on that.
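If that's right, one possible workaround (a hypothetical sketch, not something I've run against TEI; it assumes TEI returns the raw 1536d pooled embeddings and that you have a local clone of the model repo) would be to apply the 2_Dense_1024 projection client-side:

```python
import torch
from sentence_transformers.models import Dense

# Load the Dense projection from a local clone of the repo (placeholder path)
dense = Dense.load("stella_en_1.5B_v5/2_Dense_1024")

# Stand-in for embeddings returned by TEI (pooling only, 1536d assumed)
pooled = torch.randn(2, 1536)

with torch.no_grad():
    out = dense({"sentence_embedding": pooled})["sentence_embedding"]
print(out.shape)  # torch.Size([2, 1024])
```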
@philschmid @olivierdehaene any updates on this for Stella on TEI?
Hi @philschmid @olivierdehaene, any workaround or update on this for Stella on TEI?