Use in SageMaker
When deploying the model to SageMaker via the Hugging Face-provided script, it's not clear how to call the inference APIs from the SageMaker Invoke API. Would it be possible to provide an example using `aws sagemaker-runtime --cli-binary-format raw-in-base64-out invoke-endpoint --endpoint-name`?
For example, what data should be posted for t2tt?
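For reference, constructing a t2tt request body might look like the sketch below. This is an assumption-laden sketch: it presumes the endpoint was deployed with the Hugging Face inference toolkit (which expects JSON with an `inputs` key), the `src_lang`/`tgt_lang` parameter names follow the transformers SeamlessM4T API, and the endpoint name is hypothetical.

```python
import json

# Hypothetical t2tt payload; the "inputs"/"parameters" shape follows the
# Hugging Face inference toolkit convention, and src_lang/tgt_lang are the
# SeamlessM4T language-code parameters (here English -> French).
payload = {
    "inputs": "Hello, world!",
    "parameters": {"src_lang": "eng", "tgt_lang": "fra"},
}
body = json.dumps(payload)
print(body)

# To invoke via boto3 (endpoint name below is a placeholder):
# import boto3
# client = boto3.client("sagemaker-runtime")
# response = client.invoke_endpoint(
#     EndpointName="seamless-m4t-v2-large",
#     ContentType="application/json",
#     Body=body,
# )
```

The same JSON string could be passed to the `aws sagemaker-runtime invoke-endpoint` CLI via its `--body` argument.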
Hmm, it seems this relies on Hugging Face transformers, and seamless-m4t-v2-large isn't in a released version of transformers yet. Presumably that means it won't work when deployed to SageMaker. It also seems support for the upcoming transformers release may need to be added to https://github.com/aws/sagemaker-huggingface-inference-toolkit/ before this model can be deployed to SageMaker?
The latest release of transformers (4.36.2) contains the model - let us know how you get on!
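A quick way to check whether a local environment meets that minimum is a small version comparison; the helper below is a sketch with no SageMaker-specific assumptions (in practice you would pass `transformers.__version__` as the installed version):

```python
def meets_minimum(installed: str, minimum: str = "4.36.2") -> bool:
    """Return True if `installed` (a transformers version string) is >= `minimum`."""
    def parse(v: str) -> tuple:
        # Keep only the leading numeric components, e.g. "4.36.2.dev0" -> (4, 36, 2)
        parts = []
        for p in v.split("."):
            if p.isdigit():
                parts.append(int(p))
            else:
                break
        return tuple(parts)
    return parse(installed) >= parse(minimum)

print(meets_minimum("4.36.2"))  # True: release that contains the model
print(meets_minimum("4.35.0"))  # False: model not yet in this release
```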
Hi @joehoyle,
The transformers version is pinned in the Python package https://github.com/aws/sagemaker-python-sdk; maybe you can open an issue there.
Since transformers is updated frequently, some manual work is needed to keep it compatible; for instance, there is an open PR to enable transformers v4.32.0: https://github.com/aws/sagemaker-python-sdk/issues/4075