Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with latest builds of iPXE on Mellanox ConnectX #115

Open
nshalman opened this issue Dec 11, 2023 · 2 comments
Open

Issue with latest builds of iPXE on Mellanox ConnectX #115

nshalman opened this issue Dec 11, 2023 · 2 comments

Comments

@nshalman
Copy link
Member

nshalman commented Dec 11, 2023

Current status

This is a ticket for tracking issues related to ipxe/ipxe#1091

There is a simple undef workaround that we can ship in our configuration until the upstream bug is fully resolved:
ipxe/ipxe#1091 (comment)

#undef NET_PROTO_EAPOL

We should ship that workaround until the upstream issue is resolved.

Original report

This is a ticket for tracking issues related to ipxe/ipxe#1091
For additional reference, netbootxyz/netboot.xyz#1359

Basically Equinix Metal is blocked on updating ipxedust because the latest builds of ipxe don't work on a bunch of our currently deployed fleet.

I have tested nshalman/ipxe@fbc3b4a as a revert of ipxe/ipxe@8b14652

Ideally this ticket will remain open until ipxe/ipxe#1091 is fixed properly.
One option for now is to ship the revert in ipxedust and keep applying the revert until a proper fix lands.
Otherwise I can maintain a fork for Equinix in the meantime.

But I do imagine other consumers of ipxedust with similar hardware will run into a similar issue in the meantime...

I intend to open a PR that would ship the revert, but I completely understand if the tinkerbell project chooses not to merge it.

nshalman added a commit to nshalman/ipxedust that referenced this issue Dec 11, 2023
Relates to: tinkerbell#115, ipxe/ipxe#1091
This would update iPXE to the latest master plus a revert
of a commit identified to break Mellanox NICs.

This is a temporary measure that might need to be repeated on
future updates of iPXE until ipxe/ipxe#1091 has been fixed correctly.
nshalman added a commit to nshalman/ipxedust that referenced this issue Dec 11, 2023
Relates to: tinkerbell#115, ipxe/ipxe#1091
This would update iPXE to the latest master plus a revert
of a commit identified to break Mellanox NICs.

This is a temporary measure that might need to be repeated on
future updates of iPXE until ipxe/ipxe#1091 has been fixed correctly.

Signed-off-by: Nahum Shalman <nshalman@equinix.com>
nshalman added a commit to nshalman/ipxedust that referenced this issue Dec 13, 2023
Relates to: tinkerbell#115, ipxe/ipxe#1091
This would update iPXE to the latest master plus a revert
of a commit identified to break Mellanox NICs.

This is a temporary measure that might need to be repeated on
future updates of iPXE until ipxe/ipxe#1091 has been fixed correctly.

Signed-off-by: Nahum Shalman <nshalman@equinix.com>
nshalman added a commit to nshalman/ipxedust that referenced this issue Dec 14, 2023
Relates to: tinkerbell#115, ipxe/ipxe#1091
This would update iPXE to the latest master plus a revert
of a commit identified to break Mellanox NICs.

This is a temporary measure that might need to be repeated on
future updates of iPXE until ipxe/ipxe#1091 has been fixed correctly.

Signed-off-by: Nahum Shalman <nshalman@equinix.com>
mergify bot added a commit that referenced this issue Dec 15, 2023
## Description

This would update iPXE to the latest master plus a revert of a commit identified to break Mellanox NICs.
This is a temporary measure that might need to be repeated on future updates of iPXE until ipxe/ipxe#1091 has been fixed correctly.

## Why is this needed

Relates to: #115, ipxe/ipxe#1091

## How Has This Been Tested?

I did repeated testing of iPXE binaries from my branches
[mainline-ipxe](https://github.com/nshalman/ipxedust/tree/mainline-ipxe)
[test-revert](https://github.com/nshalman/ipxedust/tree/test-revert)
to confirm that the issue exists and is addressed by the revert.

## How are existing users impacted? What migration steps/scripts do we need?

This would unbreak users of certain Mellanox NICs (Definitely Mellanox CX4, possibly others too)

## Checklist:

I have:

- [ ] updated the documentation and/or roadmap (if required)
- [ ] added unit or e2e tests
- [ ] provided instructions on how to upgrade
@nshalman
Copy link
Member Author

ipxe/ipxe#1091 (comment) suggests we could probably get away with an undef instead of needing a patch.

nshalman added a commit to nshalman/ipxedust that referenced this issue Feb 27, 2024
Resolves tinkerbell#117
Relates to tinkerbell#115

This release includes
ipxe/ipxe#1152 which resolves tinkerbell#117

I've also included the workaround for tinkerbell#115 as specified from
ipxe/ipxe#1091 (comment)
@nshalman nshalman mentioned this issue Feb 27, 2024
2 tasks
nshalman added a commit to nshalman/ipxedust that referenced this issue Feb 27, 2024
Resolves tinkerbell#117
Relates to tinkerbell#115

This release includes
ipxe/ipxe#1152 which resolves tinkerbell#117

I've also included the workaround for tinkerbell#115 as specified from
ipxe/ipxe#1091 (comment)

Signed-off-by: Nahum Shalman <nshalman@equinix.com>
@nshalman
Copy link
Member Author

ipxe/ipxe#1091 (comment) suggests we could probably get away with an undef instead of needing a patch.

Tested and appears to work.

mergify bot added a commit that referenced this issue Mar 1, 2024
## Description
Update to the latest iPXE

This release includes ipxe/ipxe#1152 which resolves #117.

I've also included the workaround for #115 as specified from ipxe/ipxe#1091 (comment) to keep Mellanox cards working.

## Why is this needed

Fixes: #117
Relates to: #115

## How Has This Been Tested?

I am drafting this PR as the first step towards testing. The plan is:
- [x] Test on an affected Mellanox card
- [x] Confirm that LetsEncrypt is working
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant