Does not work on Fresh EKS Cluster with Amazon Linux 2023 AMI Type Nodes #3695

Closed
sschamp opened this issue May 14, 2024 · 6 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

sschamp commented May 14, 2024

Describe the bug
The controller pods fail to start on EKS nodes that use AL2023 instead of AL2:

{"level":"info","ts":"2024-05-14T10:15:35Z","msg":"version","GitVersion":"v2.7.2","GitCommit":"fb6460383b75e937e24548e69b6732f49b88755c","BuildDate":"2024-03-22T21:39:56+0000"}
{"level":"error","ts":"2024-05-14T10:15:38Z","logger":"setup","msg":"unable to initialize AWS cloud","error":"failed to introspect vpcID from EC2Metadata or Node name, specify --aws-vpc-id instead if EC2Metadata is unavailable: failed to fetch VPC ID from instance metadata: EC2MetadataError: failed to make EC2Metadata request\n\n\tstatus code: 401, request id: "}

Steps to reproduce

  • Create a NodeGroup with any Instance Type but set the AMI Type to:
    Amazon Linux 2023 (x86_64) Standard (AL2023_x86_64_STANDARD)
  • Schedule aws-load-balancer-controller-* on the new Nodes (either by setting NodeAffinity, or by using a New Cluster)

Expected outcome
The pods should be able to read the instance metadata of the node they run on.

Environment

  • AWS Load Balancer controller version: 2.7.2
  • Kubernetes version: EKS 1.29

Additional Context:

It might be because AL2023 nodes no longer let you query http://169.254.169.254/latest/meta-data/ directly, which would match the 401 in the log above.
They use IMDSv2 instead of IMDSv1 (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html).
With IMDSv2 you need to request a session token first and send it with every metadata request:

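TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" --header "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -s http://169.254.169.254/latest/meta-data/ --header "X-aws-ec2-metadata-token: $TOKEN"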

For example, as a one-liner that fetches the instance ID:
/usr/bin/curl --noproxy '*' -w "\n" -s -H "X-aws-ec2-metadata-token: $(curl --noproxy '*' -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")" http://169.254.169.254/latest/meta-data/instance-id
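
The error in the log is about the VPC ID rather than the instance ID, so it can help to walk the same metadata path by hand from a node and from a pod. A minimal sketch, assuming `curl` is available: the function names `imds_token`, `imds_get` and `imds_vpc_id` are made up for illustration, and this is not necessarily the exact lookup the controller performs. The paths used (`/latest/api/token`, `/latest/meta-data/mac`, `/latest/meta-data/network/interfaces/macs/<mac>/vpc-id`) are standard IMDS paths.

```shell
IMDS="http://169.254.169.254/latest"

# Request an IMDSv2 session token; $1 is the TTL in seconds (defaults to 60).
imds_token() {
  curl --noproxy '*' -s -f -m 5 -X PUT "$IMDS/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: ${1:-60}"
}

# GET a metadata path ($2) using the session token in $1.
imds_get() {
  curl --noproxy '*' -s -f -m 5 \
    -H "X-aws-ec2-metadata-token: $1" "$IMDS/meta-data/$2"
}

# Print the VPC ID of the instance's primary network interface.
# Runs in a subshell so the helper variables do not leak into the caller.
imds_vpc_id() (
  token=$(imds_token) || {
    echo "IMDSv2 token request failed (is the hop limit too low?)" >&2
    exit 1
  }
  mac=$(imds_get "$token" mac) || exit 1
  imds_get "$token" "network/interfaces/macs/$mac/vpc-id"
)
```

Running `imds_vpc_id` directly on the node and then inside a pod on the same node should show whether only the pod is cut off. If the call works on the node but the token request fails in the pod, the hop limit is the likely cause.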

@aravindsagar

/kind bug

@k8s-ci-robot added the kind/bug label May 15, 2024
@oliviassss
Collaborator

@sschamp, hi. The EC2 instances may be using a hop limit of 1, which is the default. Can you change the hop limit to 2 and see if that fixes the issue? Alternatively, you can specify the VPC ID directly with the controller flag --aws-vpc-id.
see: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.7/deploy/installation/#using-the-amazon-ec2-instance-metadata-server-version-2-imdsv2
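
The hop limit matters because a request from a pod passes through one extra network hop on its way to the metadata service, so a limit of 1 stops the token response from reaching the pod. A minimal sketch of raising it on a single running instance with the AWS CLI follows. The function name `set_imds_hop_limit` and the instance ID in the usage note are illustrative; the command and flags belong to `aws ec2 modify-instance-metadata-options`.

```shell
# Raise the IMDS PUT response hop limit on one running instance.
# $1 = instance ID, $2 = hop limit (defaults to 2).
set_imds_hop_limit() {
  aws ec2 modify-instance-metadata-options \
    --instance-id "$1" \
    --http-put-response-hop-limit "${2:-2}" \
    --http-tokens required \
    --http-endpoint enabled
}
```

Usage would look like `set_imds_hop_limit i-0123456789abcdef0`. This only changes the instance it is called on. Nodes that a node group launches later take their metadata options from the launch template, so a lasting fix likely belongs there.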

@sschamp
Author

sschamp commented May 27, 2024

I went with the option of manually specifying --aws-vpc-id and all is well again.
This issue can be closed.
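
For readers who installed the controller with Helm, the workaround above can be sketched as follows. The chart exposes the VPC ID and region as the `vpcId` and `region` values. The release name `aws-load-balancer-controller`, the `kube-system` namespace and the `eks` repo alias are assumptions taken from the usual install instructions, and the function names are hypothetical.

```shell
# Look up the VPC an EKS cluster lives in.
# $1 = cluster name, $2 = AWS region.
cluster_vpc_id() {
  aws eks describe-cluster --name "$1" --region "$2" \
    --query 'cluster.resourcesVpcConfig.vpcId' --output text
}

# Re-deploy the controller with the VPC ID and region set explicitly,
# so it no longer has to read them from instance metadata.
# $1 = cluster name, $2 = AWS region.
pin_controller_vpc() (
  vpc_id=$(cluster_vpc_id "$1" "$2") || exit 1
  helm upgrade aws-load-balancer-controller eks/aws-load-balancer-controller \
    --namespace kube-system --reuse-values \
    --set vpcId="$vpc_id" --set region="$2"
)
```

Usage would look like `pin_controller_vpc my-cluster eu-west-1`, where both arguments are placeholders. The region is set together with the VPC ID because the controller otherwise also tries to discover the region from instance metadata.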

@oliviassss
Collaborator

Thanks for the confirmation, closing it now.

@fcuello-fudo

> Thanks for the confirmation, closing it now.

Can we please reopen? Although setting --aws-vpc-id works, this does not match the documentation, which states that the VPC ID is auto-detected when the flag is not specified (that works on Amazon Linux 2 but not on 2023). This is a breaking change that more people will likely run into as AL2023 becomes more widely used.

@oliviassss
Collaborator

@fcuello-fudo, can you check your instance's hop limit? To fetch the VPC ID, the controller requires the hop limit to be at least 2.
We call this out in the live docs: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.8/deploy/installation/#using-the-amazon-ec2-instance-metadata-server-version-2-imdsv2
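
One way to check the current hop limit of a node is to read the instance's metadata options with the AWS CLI. A minimal sketch: the function names are illustrative, while `MetadataOptions.HttpPutResponseHopLimit` is the field returned by `aws ec2 describe-instances`.

```shell
# Print the current IMDS hop limit of an instance ($1 = instance ID).
get_imds_hop_limit() {
  aws ec2 describe-instances --instance-ids "$1" \
    --query 'Reservations[0].Instances[0].MetadataOptions.HttpPutResponseHopLimit' \
    --output text
}

# Succeed only if the hop limit is at least 2, which the controller
# needs in order to reach instance metadata from inside a pod.
hop_limit_ok() (
  limit=$(get_imds_hop_limit "$1") || exit 1
  [ "$limit" -ge 2 ] 2>/dev/null
)
```

For example, `hop_limit_ok i-0123456789abcdef0 && echo "hop limit is fine"` (with a placeholder instance ID) prints the message only when the limit is 2 or higher.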
