very rare ReferenceError #2970

Open
tsah-alike opened this issue Jun 15, 2023 · 7 comments

@tsah-alike

Describe the bug

We are running Python 3.8 on AWS Lambda and use boto3. The code is patched by aws-xray-sdk and the Lumigo tracer. Very rarely (once every few months) we encounter a ReferenceError. Once it occurs, it keeps happening on every invocation for as long as the same Lambda instance is reused.

We could not find a way to reproduce it. All we have is the stack trace.

Expected Behavior

Should not raise ReferenceError.

Current Behavior

ReferenceError is raised

This particular one happened during an update_item call (UpdateItem) on a DynamoDB Table object.

[ERROR] ReferenceError: weakly-referenced object no longer exists
*** application part of stack trace ***
    table_handler.update_item(
  File "/var/runtime/boto3/resources/factory.py", line 580, in do_action
    response = action(self, *args, **kwargs)
  File "/var/runtime/boto3/resources/action.py", line 88, in __call__
    response = getattr(parent.meta.client, operation_name)(*args, **params)
  File "/opt/python/botocore/client.py", line 530, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/opt/python/wrapt/wrappers.py", line 644, in __call__
    return self._self_wrapper(self.__wrapped__, self._self_instance,
  File "/opt/python/aws_xray_sdk/ext/botocore/patch.py", line 38, in _xray_traced_botocore
    return xray_recorder.record_subsegment(
  File "/opt/python/aws_xray_sdk/core/recorder.py", line 462, in record_subsegment
    six.raise_from(exc, exc)
  File "<string>", line 3, in raise_from
  File "/opt/python/aws_xray_sdk/core/recorder.py", line 457, in record_subsegment
    return_value = wrapped(*args, **kwargs)
  File "/opt/python/botocore/client.py", line 943, in _make_api_call
    http, parsed_response = self._make_request(
  File "/opt/python/botocore/client.py", line 966, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/opt/python/botocore/endpoint.py", line 119, in make_request
    return self._send_request(request_dict, operation_model)
  File "/opt/python/botocore/endpoint.py", line 198, in _send_request
    request = self.create_request(request_dict, operation_model)
  File "/opt/python/botocore/endpoint.py", line 134, in create_request
    self._event_emitter.emit(
  File "/opt/python/botocore/hooks.py", line 412, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/opt/python/botocore/hooks.py", line 256, in emit
    return self._emit(event_name, kwargs)
  File "/opt/python/botocore/hooks.py", line 239, in _emit
    response = handler(**kwargs)
  File "/opt/python/botocore/signers.py", line 105, in handler
    return self.sign(operation_name, request)
  File "/opt/python/botocore/signers.py", line 149, in sign
    signature_version = self._choose_signer(
  File "/opt/python/botocore/signers.py", line 219, in _choose_signer
    handler, response = self._event_emitter.emit_until_response(

Reproduction Steps

I'm sorry, we did not manage to reproduce this.

Possible Solution

It seems that the RequestSigner class holds a weak reference to some object, but the case where that object has been garbage-collected is not handled.
To fix this, wrap the expression at botocore/signers.py, line 219, in a try/except block and handle the ReferenceError case.
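
To illustrate the failure mode described above, here is a minimal sketch using only the standard library. The `Emitter` and `Signer` classes are hypothetical stand-ins, not botocore's actual classes; the sketch assumes the signer keeps a `weakref.proxy` to its emitter and shows how a `ReferenceError` surfaces once the referent is gone, and how a try/except could catch it.

```python
import gc
import weakref


class Emitter:
    """Hypothetical stand-in for an event emitter."""

    def emit_until_response(self, event_name):
        return None, f"handled {event_name}"


class Signer:
    """Hypothetical stand-in that only holds a weak proxy to its emitter."""

    def __init__(self, emitter):
        # The proxy does not keep the emitter alive.
        self._event_emitter = weakref.proxy(emitter)

    def choose_signer(self, operation_name):
        try:
            _, response = self._event_emitter.emit_until_response(
                f"choose-signer.{operation_name}"
            )
            return response
        except ReferenceError:
            # The emitter was garbage-collected; report that no handler responded.
            return None


emitter = Emitter()
signer = Signer(emitter)
alive_result = signer.choose_signer("UpdateItem")

del emitter   # drop the only strong reference
gc.collect()  # make collection explicit on non-refcounting interpreters
dead_result = signer.choose_signer("UpdateItem")

print(alive_result)
print(dead_result)
```

Whether returning a fallback is the right handling in botocore itself is an open question; the sketch only shows where the exception comes from and where it could be caught.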

Additional Information/Context

We are running Python 3.8 on AWS Lambda, using the official runtime. We use boto3. The code is patched by aws-xray-sdk and the Lumigo tracer.

SDK version used

unknown, included with python3.8 AWS Lambda runtime

Environment details (OS name and version, etc.)

python3.8 lambda runtime, intel processor

tsah-alike added the bug and needs-triage labels on Jun 15, 2023
tim-finnigan self-assigned this on Jun 15, 2023
@tim-finnigan
Contributor

Thanks @tsah-alike for reaching out. Which version of botocore are you using? Can you share any code snippets that resulted in this error?

I think it would be worth opening an issue directly with the aws-xray-sdk-python repository for this.

tim-finnigan added the response-requested label and removed the bug and needs-triage labels on Jun 15, 2023
@tsah-alike
Author

Thanks for responding @tim-finnigan, I'll open an issue there as well.
The version is unknown since it's coming from the AWS Lambda runtime. My guess is it's pretty recent but not the most recent.

@tsah-alike
Author

We first noticed this bug about a year ago.

github-actions bot removed the response-requested label on Jun 19, 2023
@tim-finnigan
Contributor

Hi @tsah-alike thanks for following up. Per the documentation on Lambda runtimes the packaged botocore version would be botocore-1.29.90. And it looks like aws-xray-sdk-python accepts versions going back as far as 1.11.3. You could confirm your version by checking the logs (adding boto3.set_stream_logger('') to your script) or just importing and printing it:

import botocore
print(botocore.__version__)

I'll link the related issue you created in the other repository: aws/aws-xray-sdk-python#394

If you can share any other details such as code snippets or steps to reproduce then that may help narrow down the issue.
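
Since the stack trace above shows boto3 loaded from /var/runtime and botocore loaded from /opt/python, it may also help to log where each package is imported from, not just its version. The following is a rough sketch using only the standard library; `describe_package` is a hypothetical helper name, not part of boto3 or botocore.

```python
import importlib.metadata
import importlib.util


def describe_package(name):
    """Return the installed version and file location of a top-level package, if any."""
    try:
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        # No distribution metadata is installed under this name.
        version = None
    spec = importlib.util.find_spec(name)
    location = spec.origin if spec is not None else None
    return {"name": name, "version": version, "location": location}


# Log this once at cold start so it appears in the instance's logs.
for pkg in ("boto3", "botocore"):
    print(describe_package(pkg))
```

If the two locations differ (for example, one under the runtime directory and one under a layer), that would suggest the versions are not the pair that was tested together, which may or may not be related to the error.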

tim-finnigan removed their assignment on Jun 19, 2023
tim-finnigan added the response-requested label on Jun 19, 2023
@tsah-alike
Author

The version is 1.29.156.
We couldn't create a minimal working example. The last failure was something like this (simplified):

    import boto3
    from boto3.dynamodb.conditions import Key
    from botocore.config import Config

    config = Config(connect_timeout=1, read_timeout=5, retries={'max_attempts': 3})
    session = boto3.Session()
    resource = session.resource('dynamodb', config=config)
    # ... business logic ...
    res = resource.query(KeyConditionExpression=Key('post_id').eq(post_id), IndexName='post_id')
    # ... business logic ...
    resource.update_item(
        Key=key,
        UpdateExpression='set is_deleted = :is_deleted',
        ExpressionAttributeValues={':is_deleted': True}
    )
    # ^^^ ReferenceError is thrown here

This worked perfectly for months, but once that ReferenceError was thrown, the same Lambda instance failed in exactly the same way 266 times, even though the boto3 session and the DynamoDB resource are recreated on each invocation. Once the Lambda instance was replaced (last week), the error stopped, and it has worked fine ever since.

@tsah-alike
Author

I know it's not a lot of information. I did try my best to find steps to reproduce.

github-actions bot removed the response-requested label on Jun 20, 2023
@StickStack

I encountered the same issue while implementing https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/LambdaRedis.step2.html on a Python 3.10 Lambda. Locally on my Mac, I do not get this issue unless the function is async.
