
(custom-resources): Package does not exist #30067

Open
athewsey opened this issue May 6, 2024 · 9 comments · May be fixed by #30095
Labels
@aws-cdk/custom-resources Related to AWS CDK Custom Resources bug This issue is a bug. effort/small Small work item – less than a day of effort p2

Comments


athewsey commented May 6, 2024

Describe the bug

I'm trying to use AwsCustomResource from Python for a couple of actions on @aws-sdk/client-cognito-identity-provider, and deployment keeps failing with errors like:

Received response status [FAILED] from custom resource. Message returned:
Package @aws-sdk/client-cognito-identity-provider does not exist. (RequestId: 99b79a89-1a17-4acf-864c-84b3ac3e5664)

Expected Behavior

The affected resource (see repro steps below) should deploy successfully and create a user in the provided Cognito user pool.

Current Behavior

I'm getting the above-mentioned error message, and the resource fails to create (or roll back/delete). I also tried providing the service name as CognitoIdentityServiceProvider, but this gave the same error message (still citing the @aws-sdk/client-cognito-identity-provider package name).

This may be intermittent, as I managed to get the stack to deploy (updating an existing stack to add this resource) at least once, but I'm now facing the error consistently.

Reproduction Steps

Given a Python CDK stack with a resource something like:

from aws_cdk.custom_resources import (
    AwsCustomResource,
    AwsCustomResourcePolicy,
    AwsSdkCall,
    PhysicalResourceId,
)

AwsCustomResource(
    self,
    "AwsCustomResource-CreateUser",
    on_create=AwsSdkCall(
        action="adminCreateUser",
        parameters={
            "UserPoolId": ...,
            "Username": ...,
            "MessageAction": "SUPPRESS",
            "TemporaryPassword": ...,
        },
        physical_resource_id=PhysicalResourceId.of(
            f"AwsCustomResource-CreateUser-{...}"
        ),
        service="@aws-sdk/client-cognito-identity-provider",
    ),
    on_delete=AwsSdkCall(
        action="adminDeleteUser",
        parameters={
            "UserPoolId": ...,
            "Username": ...,
        },
        service="@aws-sdk/client-cognito-identity-provider",
    ),
    policy=AwsCustomResourcePolicy.from_sdk_calls(
        resources=AwsCustomResourcePolicy.ANY_RESOURCE
    ),
    install_latest_aws_sdk=True,
)

...then try to deploy the stack.

Possible Solution

🤷‍♂️

Additional Information/Context

Originally observed on CDK v1.126.0, so tried upgrading to 2.140.0 but it didn't help.

CDK CLI Version

2.140.0

Framework Version

2.140.0

Node.js Version

20.9.0

OS

macOS 14.4.1

Language

Python

Language Version

Python 3.12.1

Other information

Seems possibly related to #28005, which was closed due to inactivity but raised against an older CDK version.

@athewsey athewsey added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels May 6, 2024
@github-actions github-actions bot added the @aws-cdk/custom-resources Related to AWS CDK Custom Resources label May 6, 2024

glitchassassin commented May 6, 2024

As of 11:00 EST on 5/3, we have been seeing a similar error with Python 3.10, CDK 2.134.0, using an AwsSdkCall for SSM's getParameter action. In our case the error is Package @aws-sdk/client-ssm does not exist.

from datetime import datetime

from aws_cdk import Stack
from aws_cdk import custom_resources as cr

cr.AwsCustomResource(
    self,
    "get_parameter",
    on_update=cr.AwsSdkCall(
        service="SSM",
        action="getParameter",
        parameters={
            "Name": parameter_name,
            "WithDecryption": True,
        },
        physical_resource_id=cr.PhysicalResourceId.of(
            str(datetime.utcnow()),
        ),
        region=region,
    ),
    policy=cr.AwsCustomResourcePolicy.from_sdk_calls(
        resources=[
            Stack.of(self).format_arn(
                service="ssm",
                region=region,
                resource="parameter",
                resource_name=parameter_name.lstrip("/"),
            )
        ]
    ),
)

The issue also appears to be intermittent for us.


athewsey commented May 6, 2024

For now, un-setting install_latest_aws_sdk seems to have stabilized our configuration (based on ~3 repeated deployments)... but I suspect this may just be luck of the draw rather than a real remedy. Our full source code & patch commit are available here

@glitchassassin it looks like you're not using the install_latest_aws_sdk option though right? And still seeing the issue?
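For reference, the workaround described above amounts to something like the following. This is a minimal, hedged sketch reusing the construct from the repro; the elided arguments stay elided, and setting the flag explicitly to False is an assumption (un-setting it relies on the default for your CDK version):

```python
# Sketch of the workaround: set install_latest_aws_sdk to False so the handler
# uses the SDK version bundled with the Lambda runtime, instead of attempting
# the flaky npm install into /tmp. Other arguments as in the repro above.
AwsCustomResource(
    self,
    "AwsCustomResource-CreateUser",
    on_create=...,
    on_delete=...,
    policy=...,
    install_latest_aws_sdk=False,
)
```

The trade-off is that SDK calls are then limited to whatever SDK version the Lambda runtime bundles, which may lack newer API parameters.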

@glitchassassin

Correct, we are not.

On Friday, it failed on 2/6 deploys. Today we've had four successful releases so far and no failures. I'm configuring logging on the AwsSdkCall in hopes of capturing more details if it happens again.


glitchassassin commented May 6, 2024

Aha, tracked down some logs from Friday! They showed up by default in a CloudWatch log group named /aws/lambda/[stack_name]-AWS[random hexadecimal]:

Installing latest AWS SDK v3: @aws-sdk/client-ssm
Failed to install latest AWS SDK v3. Falling back to pre-installed version. Error: SyntaxError: Error parsing /tmp/node_modules/@smithy/shared-ini-file-loader/package.json: Unexpected end of JSON input

In another instance:

Installing latest AWS SDK v3: @aws-sdk/client-ssm
Failed to install latest AWS SDK v3. Falling back to pre-installed version. Error: Error: Cannot find module '@smithy/shared-ini-file-loader'
Require stack:

  • /tmp/node_modules/@smithy/node-config-provider/dist-cjs/index.js
  • /tmp/node_modules/@smithy/middleware-endpoint/dist-cjs/adaptors/getEndpointFromConfig.js
  • /tmp/node_modules/@smithy/middleware-endpoint/dist-cjs/index.js
  • /tmp/node_modules/@smithy/core/dist-cjs/index.js
  • /tmp/node_modules/@aws-sdk/client-ssm/dist-cjs/index.js
  • /var/task/index.js
  • /var/runtime/index.mjs

It seems like each time this runs, there's an initial attempt to install the SDK, which always times out after 120 seconds (based on ResourceProperties in the logs, InstallLatestAwsSdk is true even though it isn't explicitly set in our code). The Lambda is immediately invoked again, and this time the install either succeeds or fails in under a minute. If it fails, it logs that it is falling back to the pre-installed version.

After the install, an Update request is logged, and it returns the parameter it's supposed to be fetching correctly (whether the install failed or succeeded).

Then, in some cases, there is a second Update request in the logs a couple minutes later, and that is where the "Package does not exist" error gets thrown. The request is identical to the first Update request except that the physicalResourceId is different (it's using the current date/time, as described here).

After reviewing our deployment logs, this seems to only have happened when we had back-to-back deployments within a couple minutes of each other, so the second deployment's Update request hits the same running lambda instance that was created by the first deployment.

It looks like when the Lambda doesn't get cleaned up after an install failure, the next Update request fails.


glitchassassin commented May 6, 2024

Based on this:

function installLatestSdk(packageName: string): void {
  console.log(`Installing latest AWS SDK v3: ${packageName}`);
  // Both HOME and --prefix are needed here because /tmp is the only writable location
  execSync(
    `NPM_CONFIG_UPDATE_NOTIFIER=false HOME=/tmp npm install ${JSON.stringify(packageName)} --omit=dev --no-package-lock --no-save --prefix /tmp`,
  );
  installedSdk = {
    ...installedSdk,
    [packageName]: true,
  };
}

interface AwsSdk {
  [key: string]: any;
}

async function loadAwsSdk(
  packageName: string,
  installLatestAwsSdk?: 'true' | 'false',
) {
  let awsSdk: AwsSdk;
  try {
    if (!installedSdk[packageName] && installLatestAwsSdk === 'true') {
      try {
        installLatestSdk(packageName);
        // MUST use require here. Dynamic import() do not support importing from directories
        // esbuild-disable unsupported-require-call -- not esbuildable but that's fine
        awsSdk = require(`/tmp/node_modules/${packageName}`);
      } catch (e) {
        console.log(`Failed to install latest AWS SDK v3. Falling back to pre-installed version. Error: ${e}`);
        // MUST use require as dynamic import() does not support importing from directories
        // esbuild-disable unsupported-require-call -- not esbuildable but that's fine
        return require(packageName); // Fallback to pre-installed version
      }

I wonder if the initial npm install failure is leaving /tmp/node_modules in an invalid state, but a subsequent npm install fails to detect the issue and thinks everything is installed?

Nope! It's actually failing on the require, not on the npm install command. So at this point installedSdk[packageName] is true. The next time it runs, the handler skips trying to install and falls through to the next branch of the if statement:

    } else if (installedSdk[packageName]) {
      // MUST use require here. Dynamic import() do not support importing from directories
      // esbuild-disable unsupported-require-call -- not esbuildable but that's fine
      awsSdk = require(`/tmp/node_modules/${packageName}`);
    } else {
      // esbuild-disable unsupported-require-call -- not esbuildable but that's fine
      awsSdk = require(packageName);
    }

But there's no try/catch here, so this time when the require fails, it doesn't fall back to the pre-installed version.
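The failure mode described above can be sketched in a short, runnable model. Python is used here for brevity (the real handler is TypeScript in the CDK custom-resources runtime), and the function names, the fake_require stand-in, and the always-broken /tmp install are all illustrative assumptions:

```python
# Hypothetical model of the caching bug: a module-level flag marks the package
# as installed even when loading the /tmp copy fails, and the warm-invocation
# code path has no fallback.

installed_sdk = {}  # module-level cache; survives across warm Lambda invocations


def fake_require(path):
    # Stand-in for Node's require(). Pretend the npm install into /tmp
    # completed but left node_modules in a broken state.
    if path.startswith("/tmp/node_modules/"):
        raise ImportError(f"Cannot find module {path!r}")
    return f"pre-installed:{path}"


def load_aws_sdk(package_name, install_latest_aws_sdk=True):
    if not installed_sdk.get(package_name) and install_latest_aws_sdk:
        installed_sdk[package_name] = True  # flag set before the require succeeds
        try:
            return fake_require(f"/tmp/node_modules/{package_name}")
        except ImportError:
            # Fallback works here... but the cache flag above is never reset.
            return fake_require(package_name)
    elif installed_sdk.get(package_name):
        # Warm invocation: no try/except, so the broken /tmp install raises.
        return fake_require(f"/tmp/node_modules/{package_name}")
    return fake_require(package_name)
```

In this model the first (cold) invocation falls back to the pre-installed SDK, but the second (warm) invocation hits the cached flag and raises, matching the "Package does not exist" behaviour; resetting the flag in the fallback branch, or wrapping the warm-path require in the same try/except, would avoid it.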

@glitchassassin

Drafting a PR with a fix

@khushail khushail added investigating This issue is being investigated and/or work is in progress to resolve the issue. and removed needs-triage This issue or PR still needs to be triaged. labels May 7, 2024
@khushail khushail added p2 effort/small Small work item – less than a day of effort and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. labels May 8, 2024

khushail commented May 8, 2024

Thanks @athewsey for reporting this issue. There have been multiple incidents of this issue reported by customers recently.

Thanks @glitchassassin for submitting a PR.

@ofiriluz

Hi, any update on this?
We have started getting this as well when deleting an Events rule, for some reason:
"Package @aws-sdk/client-cloudwatch-logs does not exist"

This is currently blocking our pipelines from fully passing.

@glitchassassin

Waiting on some guidance about the failing integration tests on the PR - I'm not sure how to resolve the build issues.
