Describe the bug
Using a Lambda function with boto3 to query the Llama-2 7B (neuron) model deployed on an ml.inf2.xlarge instance, the InvokeEndpoint operation fails with the following message:
{
  "errorMessage": "An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message \"{\n  \"code\": 400,\n  \"type\": \"BadRequestException\",\n  \"message\": \"Parameter model_name is required.\"\n}\n\". See https://us-east-2.console.aws.amazon.com/cloudwatch/home?region=us-east-2#logEventViewer:group=/aws/sagemaker/Endpoints/testllamaneuron in account XXXXXXX for more information.",
  "errorType": "ModelError",
  "requestId": "2f2a7aa4-9eeb-42f5-9a14-6285894581bb",
  "stackTrace": [
    "  File \"/var/task/lambda.py\", line 19, in handler\n    response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,\n",
    "  File \"/var/runtime/botocore/client.py\", line 530, in _api_call\n    return self._make_api_call(operation_name, kwargs)\n",
    "  File \"/var/runtime/botocore/client.py\", line 960, in _make_api_call\n    raise error_class(parsed_response, operation_name)\n"
  ]
}
To reproduce
Create a Lambda function that queries the endpoint with the following code:
import boto3
import json

def handler(event, context):
    runtime = boto3.client('runtime.sagemaker')
    ENDPOINT_NAME = 'testllamaneuron'
    dic = {
        "inputs": [
            [
                {"role": "system", "content": "You are chat bot who writes songs"},
                {"role": "user", "content": "Write a rap song about Amazon Web Services"}
            ]
        ],
        "parameters": {"max_new_tokens": 256, "top_p": 0.9, "temperature": 0.6}
    }
    response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                       ContentType='application/json',
                                       Body=json.dumps(dic),
                                       CustomAttributes="accept_eula=true")
    result = json.loads(response['Body'].read().decode())
    print(result)
    return {
        "statusCode": 200,
        "body": json.dumps(result)
    }
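For what it's worth, the request body itself can be built and inspected without any AWS call, which helps rule out payload-construction problems before blaming the endpoint. A minimal sketch (the `build_payload` helper is hypothetical, not part of the notebook; it just reproduces the dialog-format body the Lambda above sends):

```python
import json

def build_payload(system_prompt, user_prompt, **params):
    # Llama-2 chat endpoints expect "inputs" as a list of dialogs,
    # each dialog being a list of {"role", "content"} messages.
    return json.dumps({
        "inputs": [[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ]],
        "parameters": params,
    })

body = build_payload(
    "You are chat bot who writes songs",
    "Write a rap song about Amazon Web Services",
    max_new_tokens=256, top_p=0.9, temperature=0.6,
)
print(json.loads(body)["inputs"][0][0]["role"])  # -> system
```

Since this payload round-trips through `json.loads` cleanly and matches the documented dialog format, the 400 does not appear to be a malformed request body on the client side.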
Logs
Lambda Function logs:
[ERROR] ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "BadRequestException",
  "message": "Parameter model_name is required."
}
Link to the notebook
https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/jumpstart-foundation-models/aws-trainium-inferentia-finetuning-deployment/llama-2-trainium-inferentia-finetuning-deployment.ipynb