Problem on SageMaker
Hello
I tried to deploy the model on SageMaker using deploy_llava.ipynb.
The endpoint status is InService, but when I call the model with:
data = {
    "image" : "https://raw.githubusercontent.com/haotian-liu/LLaVA/main/images/llava_logo.png",
    "question" : "Describe the image and color details."
    # "max_new_tokens" : 1024,
    # "temperature" : 0.2,
    # "stop_str" : "###"
}
output = predictor.predict(data)
print(output)
I got the following error:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
"code": 400,
"type": "InternalServerException",
"message": "model_fn() takes 1 positional argument but 2 were given"
}
Any idea?
The full traceback:
ModelError Traceback (most recent call last)
Cell In[7], line 10
1 data = {
2 "image" : 'https://raw.githubusercontent.com/haotian-liu/LLaVA/main/images/llava_logo.png',
3 "question" : "Describe the image and color details."
(...)
6 # "conv_mode" : "llava_v1"
7 }
9 # request
---> 10 output = predictor.predict(data)
11 print(output)
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/sagemaker/base_predictor.py:209, in Predictor.predict(self, data, initial_args, target_model, target_variant, inference_id, custom_attributes, component_name)
206 if inference_component_name:
207 request_args["InferenceComponentName"] = inference_component_name
--> 209 response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
210 return self._handle_response(response)
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/botocore/client.py:553, in ClientCreator._create_api_method.<locals>._api_call(self, *args, **kwargs)
549 raise TypeError(
550 f"{py_operation_name}() only accepts keyword arguments."
551 )
552 # The "self" in this scope is referring to the BaseClient.
--> 553 return self._make_api_call(operation_name, kwargs)
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/botocore/client.py:1009, in BaseClient._make_api_call(self, operation_name, api_params)
1005 error_code = error_info.get("QueryErrorCode") or error_info.get(
1006 "Code"
1007 )
1008 error_class = self.exceptions.from_code(error_code)
-> 1009 raise error_class(parsed_response, operation_name)
1010 else:
1011 return parsed_response
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
"code": 400,
"type": "InternalServerException",
"message": "model_fn() takes 1 positional argument but 2 were given"
}
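For anyone reading the message above: it reads like a plain Python TypeError raised inside the container, where something calls model_fn with two positional arguments while the function accepts only one. Below is a minimal local sketch of that situation, with no SageMaker involved. It assumes (not confirmed from the logs) that the serving code passes an extra context object as the second argument; `call_like_toolkit` and `model_fn_tolerant` are hypothetical names used only for illustration, and the real `model_fn` in the deployed inference script may look different.

```python
# Hypothetical stand-in for a handler that accepts only the model directory.
def model_fn(model_dir):
    return {"model_dir": model_dir}


# Hypothetical stand-in for serving code that also passes a context object.
# This calling convention is an assumption, not taken from the endpoint logs.
def call_like_toolkit(fn, model_dir, context):
    return fn(model_dir, context)


# Calling the one-argument handler with two arguments raises a TypeError.
try:
    call_like_toolkit(model_fn, "/opt/ml/model", object())
except TypeError as exc:
    error_message = str(exc)
    print(error_message)


# A tolerant variant: the second parameter is optional, so the function
# works whether or not the caller supplies a context.
def model_fn_tolerant(model_dir, context=None):
    return {"model_dir": model_dir, "has_context": context is not None}


result = call_like_toolkit(model_fn_tolerant, "/opt/ml/model", object())
print(result["has_context"])
```

If the assumption holds, giving the real `model_fn` an optional second parameter and redeploying might avoid the error, but I have not verified this against the endpoint.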