Scale up lambda failed #346

Closed
otani88 opened this issue Nov 17, 2020 · 23 comments
Labels
documentation Improvements or additions to documentation question Further information is requested

Comments

otani88 commented Nov 17, 2020

Hi. I'm getting an error from the scale-up lambda after setting up your module.
CloudWatch logs below:

ERROR	Invoke Error 	
{
    "errorType": "Error",
    "errorMessage": "Failed handling SQS event",
    "stack": [
        "Error: Failed handling SQS event",
        "    at _homogeneousError (/var/runtime/CallbackContext.js:12:12)",
        "    at postError (/var/runtime/CallbackContext.js:29:54)",
        "    at callback (/var/runtime/CallbackContext.js:41:7)",
        "    at /var/runtime/CallbackContext.js:104:16",
        "    at /var/task/index.js:16834:16",
        "    at Generator.throw (<anonymous>)",
        "    at rejected (/var/task/index.js:16816:65)",
        "    at processTicksAndRejections (internal/process/task_queues.js:97:5)"
    ]
}
    at /var/task/index.js:15124:23
    at processTicksAndRejections (internal/process/task_queues.js:97:5) {
  status: 403,
  headers: {
    'access-control-allow-origin': '*',
    'access-control-expose-headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, Deprecation, Sunset',
    connection: 'close',
    'content-encoding': 'gzip',
    'content-security-policy': "default-src 'none'",
    'content-type': 'application/json; charset=utf-8',
    date: 'Tue, 17 Nov 2020 17:51:47 GMT',
    'referrer-policy': 'origin-when-cross-origin, strict-origin-when-cross-origin',
    server: 'GitHub.com',
    status: '403 Forbidden',
    'strict-transport-security': 'max-age=31536000; includeSubdomains; preload',
    'transfer-encoding': 'chunked',
    vary: 'Accept-Encoding, Accept, X-Requested-With',
    'x-content-type-options': 'nosniff',
    'x-frame-options': 'deny',
    'x-github-media-type': 'github.v3; format=json',
    'x-github-request-id': '93DE:E7C5:957F272:AC944E7:5FB40DB3',
    'x-ratelimit-limit': '5600',
    'x-ratelimit-remaining': '5598',
    'x-ratelimit-reset': '1605639047',
    'x-ratelimit-used': '2',
    'x-xss-protection': '1; mode=block'
  },
  request: {
    method: 'GET',
    url: 'https://api.github.com/repos/RaketaApp/packer-base-ami/actions/runs?status=queued',
    headers: {
      accept: 'application/vnd.github.v3+json',
      'user-agent': 'octokit-rest.js/18.0.6 octokit-core.js/3.1.1 Node.js/12.18.4 (linux; x64)',
      authorization: 'token [REDACTED]'
    },
    request: { hook: [Function: bound bound register] }
  },
  documentation_url: 'https://docs.github.com/rest/reference/actions#list-workflow-runs-for-a-repository'
}

@adrianmiron

I had the same error. Try giving the app permissions on the Actions group.

Author

otani88 commented Nov 22, 2020

@adrianmiron Have you fixed it?

@WilliamDuTrendMicro

same issue +1

Author

otani88 commented Nov 22, 2020

@npalm Could you please help me?

@manoj-k-deepr

@adrianmiron I tried the Actions group but I'm still having the issue. Can you share all your permissions? I am trying to set up an organization runner.

Member

npalm commented Nov 23, 2020

I do not recognize the issue. The scale-up lambda fetches a message from the queue, then checks whether there are still queued jobs. If so, it scales up. The scale-up lambda is triggered for messages that have been on the queue for 30 seconds. The error message indicates the lambda is not allowed to call the API.

Can you please check that your GitHub app is set up according to the docs? Since your scale-up lambda is triggered, it seems the app is installed for the repo; otherwise no event would be received. So the most logical explanation is that the permissions are not set correctly.
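
As a rough illustration of the check described above, here is a minimal TypeScript sketch. The request URL mirrors the one in the CloudWatch log from the first comment. The names `WorkflowRunsPage`, `queuedRunsUrl`, and `shouldScaleUp` are hypothetical and are not the module's actual code; the `total_count` field is assumed from GitHub's "list workflow runs for a repository" response.

```typescript
// Hypothetical sketch; names and types are illustrative, not the module's code.

// Subset of the "list workflow runs for a repository" response that the
// decision needs. `total_count` is assumed from the GitHub REST API docs.
interface WorkflowRunsPage {
  total_count: number;
}

// Builds the URL seen in the log:
// /repos/{owner}/{repo}/actions/runs?status=queued
function queuedRunsUrl(owner: string, repo: string): string {
  return `https://api.github.com/repos/${owner}/${repo}/actions/runs?status=queued`;
}

// Scale up only when at least one workflow run is still queued.
function shouldScaleUp(page: WorkflowRunsPage): boolean {
  return page.total_count > 0;
}
```

With this sketch, `shouldScaleUp({ total_count: 0 })` returns `false`. If the GET request itself is rejected with a 403, as in the log above, this decision is never reached, which is why the failure points at the app's permissions rather than at the scaling logic.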

@adrianmiron

@manoj-k-deepr From my investigation of the same error, it turned out to be a permission issue with the GitHub app (which is the one actually querying the repo's Actions). I remember I went over the lambda -> GitHub app part 5 times and that was not it.

Share a screenshot of the permissions on the organisation/repo and I will compare in the morning.

manoj-k-deepr commented Nov 24, 2020

@npalm Yes, it's a permission issue. I fixed it by granting Self-hosted runners access (Read & Write) in the organization permissions. The docs don't mention the runner permission.
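
For anyone comparing their own setup, a minimal TypeScript sketch of such a permission check follows. Everything in it is an assumption made for illustration: the key names `actions` and `organization_self_hosted_runners` are how GitHub's API is understood to report these permissions, the Self-hosted runners level comes from this thread, and the Actions level is a guess because the thread does not state it. `missingPermissions` and `ASSUMED_REQUIRED` are hypothetical names.

```typescript
// Hypothetical sketch: compare granted GitHub App permissions with the ones
// this thread found necessary. Key names and levels are assumptions.
type Level = "read" | "write" | "admin";
type Permissions = { [name: string]: Level | undefined };

// Higher rank means broader access.
const RANK: { [L in Level]: number } = { read: 1, write: 2, admin: 3 };

// Self-hosted runners (Read & Write) comes from this thread; the level for
// Actions is not stated here, so "read" is a guess.
const ASSUMED_REQUIRED: Permissions = {
  actions: "read",
  organization_self_hosted_runners: "write",
};

// Returns "name:level" for every required permission that is absent or
// granted at a lower level than required.
function missingPermissions(granted: Permissions, required: Permissions): string[] {
  const missing: string[] = [];
  for (const name of Object.keys(required)) {
    const needed = required[name];
    const have = granted[name];
    if (needed === undefined) {
      continue;
    }
    if (have === undefined || RANK[have] < RANK[needed]) {
      missing.push(`${name}:${needed}`);
    }
  }
  return missing;
}
```

Calling `missingPermissions({ actions: "read" }, ASSUMED_REQUIRED)` reports the runner permission as missing, which matches the situation described in this comment. Check the required set against the module's documentation before relying on it.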

Author

otani88 commented Nov 24, 2020

There was a problem with the GitHub app permissions. @npalm Can you update the documentation and specify which permissions the application requires?

Member

npalm commented Nov 24, 2020

@mkryva Great that you got it working. I will leave the issue open so we can update the docs. PRs for improving the docs are always welcome!

@npalm npalm added documentation Improvements or additions to documentation question Further information is requested labels Nov 24, 2020

ernado commented Dec 19, 2020

After updating permissions, it fails with the following error:

ERROR AuthFailure.ServiceLinkedRoleCreationNotPermitted: The provided credentials do not have permission to create the service-linked role for EC2 Spot Instances.

UPD: Looks like the reason was "You've reached your quota for maximum Spot Fleet Requests for this account."

buamod commented Jan 21, 2021

The scale-up lambda is failing for me, even after the latest commit with the (GHES) fix by @mcaulifn.


DEBUG	https://enterprise.github.custom.com/api/v3

ERROR	RequestError [HttpError]: request to https://enterprise.github.custom.com/api/v3/app/installations/22/access_tokens 
failed, reason: connect ETIMEDOUT 192.168.1.1:443
    at /var/task/index.js:2797:11
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
    at async getInstallationAuthentication (/var/task/index.js:266:7) {
  status: 500,
  headers: {},
  request: {
    method: 'POST',
    url: 'https://enterprise.github.custom.com/api/v3/app/installations/22/access_tokens',
    headers: {
      accept: 'application/vnd.github.antiope-preview+json,application/vnd.github.machine-man-preview+json',
      'user-agent': 'octokit-request.js/5.4.12 Node.js/12.19.0 (linux; x64)',
      authorization: 'bearer [REDACTED]',
      'content-length': 0
    }
  }
}

ERROR RequestError [HttpError]: request to https://enterprise.github.custom.com/api/v3/app/installations/22/access_tokens failed, 
reason: connect ETIMEDOUT 192.168.1.1:443 at /var/task/index.js:2797:11 at processTicksAndRejections 
(internal/process/task_queues.js:97:5) at async getInstallationAuthentication (/var/task/index.js:266:7) 
{ status: 500, headers: {}, request: { 
    method: 'POST', url: 'https://enterprise.github.custom.com/api/v3/app/installations/22/access_tokens', 
    headers: { accept: 'application/vnd.github.antiope-preview+json,application/vnd.github.machine-man-preview+json', 
    'user-agent': 'octokit-request.js/5.4.12 Node.js/12.19.0 (linux; x64)', authorization: 'bearer [REDACTED]', 
    'content-length': 0 } } }



ERROR	Invoke Error 	
{
    "errorType": "Error",
    "errorMessage": "Failed handling SQS event",
    "stack": [
        "Error: Failed handling SQS event",
        "    at _homogeneousError (/var/runtime/CallbackContext.js:12:12)",
        "    at postError (/var/runtime/CallbackContext.js:29:54)",
        "    at callback (/var/runtime/CallbackContext.js:41:7)",
        "    at /var/runtime/CallbackContext.js:104:16",
        "    at /var/task/index.js:50911:16",
        "    at Generator.throw (<anonymous>)",
        "    at rejected (/var/task/index.js:50893:65)",
        "    at processTicksAndRejections (internal/process/task_queues.js:97:5)"
    ]
}

Member

npalm commented Jan 21, 2021

@buamod You are using GHES, right? Just to be sure, did you rebuild the lambda and ensure it is used?

@mcaulifn
Contributor

ETIMEDOUT would suggest GHES did not respond. Are you behind a proxy?

buamod commented Jan 21, 2021

@buamod You are using GHES, right? Just to be sure, did you rebuild the lambda and ensure it is used?

I deployed the lambdas from the latest commit. I built them with the docker commands from the Ci/build.sh script.

buamod commented Jan 21, 2021

ETIMEDOUT would suggest GHES did not respond. Are you behind a proxy?

There might be a proxy, I don't know. Let's say there is a proxy: how would I get past it?

@mcaulifn
Contributor

@buamod Proxy requirements can differ greatly. I would suggest asking your network team what they need to let the connection through.

@anupash147

Check whether you have a / character in your webhook secret. This may cause issues in AWS Lambda.

halil9 commented Jul 28, 2021

Is there any update on this issue? I have the same Failed handling SQS event error, and my request status is 404 in CloudWatch. The GitHub app permissions are the same as in the docs. By the way, I have no / character in my webhook secret.

Member

npalm commented Jul 28, 2021

A 404 is most likely a wrongly configured webhook. Did you use the full URL in the webhook, like https://abcdef.execute-api.eu-west-1.amazonaws.com/webhook? The webhook lambda does not respond with a 404, so most likely something is wrong in the URL. See the screenshot, which is the result of a wrong (but existing) API Gateway endpoint.

[Screenshot: response from a wrong but existing API Gateway endpoint]
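
A quick way to sanity-check the configured URL is sketched below in TypeScript. It assumes the webhook is served over https at the path `/webhook`, as in the example URL above; `webhookUrlProblems` is a hypothetical helper and not part of the module. The check is deliberately strict, so a trailing slash is also reported.

```typescript
// Hypothetical helper: lists reasons a webhook URL looks misconfigured.
// The default path "/webhook" is taken from the example URL in this thread.
function webhookUrlProblems(url: string, expectedPath: string = "/webhook"): string[] {
  // Capture scheme, host, and path; query string and fragment are ignored.
  const match = /^(https?):\/\/([^\/?#\s]+)([^?#\s]*)/.exec(url);
  if (match === null) {
    return ["not an absolute http(s) URL"];
  }
  const problems: string[] = [];
  const scheme = match[1];
  const path = match[3];
  if (scheme !== "https") {
    problems.push("scheme is not https");
  }
  if (path !== expectedPath) {
    problems.push(`path is "${path}", expected "${expectedPath}"`);
  }
  return problems;
}
```

For the example URL above the function returns an empty list, while the same host without `/webhook` yields one problem describing the missing path.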

Contributor

skyzyx commented Oct 26, 2021

Getting here a little late, but the SQS error message is a red herring. We're also using this with GHES, and I added a little more debug logging to the Lambdas.

What's happening is that if you are running the Lambdas inside your VPC, you may not be able to access services outside the VPC, but have no problem with services inside the VPC. The solution is that you need to ensure that the VPC subnets you pick are configured to communicate through a NAT gateway. That way, it can communicate with services inside AND outside the VPC.

There isn't a one-size-fits-all bit of Terraform that you can run, since changes at the VPC level will impact all traffic in that particular network. You'll probably need to talk to whoever has deep knowledge about how your VPC is configured.
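
Since "Failed handling SQS event" only wraps the real failure, it may help to triage on the underlying request error instead. The TypeScript sketch below maps the symptoms reported in this thread to the causes commenters suggested. `FailureInfo` and `likelyCause` are hypothetical names, and the mapping reflects only what was reported here, not an exhaustive diagnosis.

```typescript
// Hypothetical triage helper based only on the cases reported in this thread.
interface FailureInfo {
  status?: number;  // HTTP status logged with the request error, if any
  message?: string; // error message, e.g. "connect ETIMEDOUT 192.168.1.1:443"
}

function likelyCause(failure: FailureInfo): string {
  const message = failure.message === undefined ? "" : failure.message;
  // Check the timeout first: the GHES log above pairs ETIMEDOUT with status 500.
  if (message.indexOf("ETIMEDOUT") !== -1) {
    return "network: the lambda cannot reach GitHub (check NAT gateway, proxy, VPC subnets)";
  }
  if (failure.status === 403) {
    return "permissions: the GitHub app lacks a required permission";
  }
  if (failure.status === 404) {
    return "url: the webhook URL is probably incomplete or wrong";
  }
  return "unknown: inspect the request error logged before the SQS failure";
}
```

For example, `likelyCause({ status: 500, message: "connect ETIMEDOUT 192.168.1.1:443" })` returns the network hint rather than treating the 500 as a server-side fault.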

@ScottGuymer
Member

Closing as this seems old and hopefully resolved.

Please re-open or create a new issue if you are still experiencing problems.

@andre-lx

After updating permissions, it fails with following error:

ERROR AuthFailure.ServiceLinkedRoleCreationNotPermitted: The provided credentials do not have permission to create the service-linked role for EC2 Spot Instances.

UPD: Looks like the reason was "You've reached your quota for maximum Spot Fleet Requests for this account."

Hi @ernado. How did you solve the "You've reached your quota for maximum Spot Fleet Requests for this account." error?
