getting different embedding values for the same image when trying to generate embeddings from two different sagemaker instances of the same model.
#26 opened by rnekkanti
Hi All,
I have this model deployed to multiple endpoints. Each endpoint has its own model.tar.gz file, but they are all built from the same base repo. When I pass the same image to these endpoints, I would expect them to generate identical embedding values.
Instead, I am seeing differences in the generated values.
Has anybody else faced this issue?
Hi Rajendra,
I had a similar problem with embedding models when using different GPU instances and with batch sizes different from the ones used during training.
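In my experience the differences were tiny floating-point discrepancies rather than genuinely different embeddings. A quick sanity check is to compare the two endpoints' outputs with a tolerance and via cosine similarity instead of exact equality (a sketch with made-up example values, not your actual endpoint responses):

```python
import numpy as np

# Hypothetical embeddings returned by two endpoints for the same image.
# Real embeddings are much longer; three dimensions keep the example short.
emb_a = np.array([0.1234567, -0.9876543, 0.5555555], dtype=np.float32)
emb_b = np.array([0.1234570, -0.9876540, 0.5555551], dtype=np.float32)

# Non-deterministic GPU kernels typically cause differences on the order
# of 1e-6 or smaller; compare element-wise with a tolerance.
max_abs_diff = float(np.max(np.abs(emb_a - emb_b)))
print("max abs diff:", max_abs_diff)

# For embeddings, cosine similarity is usually what downstream code cares
# about; values extremely close to 1.0 mean the endpoints agree in practice.
cos_sim = float(
    np.dot(emb_a, emb_b) / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b))
)
print("cosine similarity:", cos_sim)
```

If the maximum absolute difference is around 1e-5 or less and cosine similarity is essentially 1.0, the endpoints are numerically equivalent and the variation is benign.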
Here are some references:
- https://openreview.net/pdf?id=9MDjKb9lGi#:~:text=We%20have%20experimentally%20shown%20that,size%20of%20the%20input%20matrix
- https://discuss.pytorch.org/t/different-training-results-on-different-machines-with-simplified-test-code/59378
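If your inference code uses PyTorch, you can also try forcing deterministic kernels before running the model; this reduces (though may not fully eliminate) run-to-run variance, at some cost in speed. A sketch, assuming a PyTorch backend in your model.tar.gz inference script:

```python
import torch

# Fix the RNG seed so any stochastic ops start from the same state.
torch.manual_seed(0)

# Disable cuDNN autotuning, which can pick different kernels per run/instance.
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True

# Ask PyTorch to prefer deterministic implementations where they exist;
# warn_only=True logs a warning instead of erroring for ops without one.
torch.use_deterministic_algorithms(True, warn_only=True)
```

Note that even with these settings, different GPU architectures can still produce slightly different floating-point results for the same computation.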
Hope this helps you.
Best regards,
Milutin