-
Notifications
You must be signed in to change notification settings - Fork 35.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net: Replace libnatpmp with built-in PCP+NATPMP implementation #30043
base: master
Are you sure you want to change the base?
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the Possibly this is due to a silent merge conflict (the changes in this pull request being Leave a comment here, if you need help tracking down a confusing failure. |
e4da306
to
d4da95f
Compare
Concept ACK |
Whoa :) (Concept ACK) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RFC6887 appendix A describes how a router that supports NAP-PMP but not PCP, will return UNSUPP_VERSION
. A log message could encourage users to try -upnp
instead (if not already enabled). Or upgrade their ancient router firmware :-)
Got distracted during review, will continue later. I'll look into how we can preserve a previously selected NatPMP checkbox in the GUI.
Regarding the custom lib/self-impl discussion in #30005: That said, it is a little rough to review as
But it seems worth the effort to me. |
Rebased for #29984 |
If you're more comfortable comparing it against another implementation there's:
Miniupnpd's is, likely, the most common server implementation in the wild. It's been tested against that. i'm hoping people will test this on various routers and conditions, there will be inevitable edge cases to iron out.
Maybe, but at some point there's not much difference between implementing a simple request/reply protocol and using a library. At least the RFC is unambigiously documented, which can't be said of many FOSS ABI's. Also, @fanquake (and @gmaxwell in the past) has been hoping for a solution that doesn't introduce a library dependency. |
Would prefer this in two steps (add PCP, then remove NAT-PMP) |
Edit: no longer relevant now that this implements PCP with NAT-PMP fallaback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested on Ubuntu 24.04 I'm getting in bound via IPv4 and IPv6.
When a GUI user previously had NAT-PMP selected there would be a "natpmp": true
field in settings.json
.
We need to rename this to "tcp"
in order for the box to remain checked.
Maybe @ryanofsky has an idea how to do this elegantly?
A little hack that works is to put the following at the end of ReadSettings()
in settings.cpp
:
// Migrate settings:
if (values.contains("natpmp")) {
auto el = values.extract("natpmp");
el.key() = "pcp";
values.insert(std::move(el));
}
A bit nicer would be to have a MigrateSettingsFile()
that's called between ReadSettingsFile
and WriteSettingsFile
in init.cpp
.
(not to be confused with OptionsModel::checkAndMigrate()
which is called before settings.json is read)
Maybe you can split src/util/netif.h
off into a separate PR, along with b06edec.
In terms of which IPv6 address to use, my understanding is this:
If we were able to tell which one one is which, then I think we should pick only one, in order of preference: (2), (3), (1). This reflects our desire to actually get inbound connections, even after a shutdown, while at the same time not doxxing ourselves when on the move (mainly for laptops, perhaps mobile in the future). We can easily detect type (1) by looking for I believe you can detect (2) by looking for Afaik there's also no guarantee about the order of results. So maybe we should just pick one at random rather than announce all three. This also seems orthogonal to PCP. |
Actually this might be an existing bug: miniupnp/miniupnp@7bd0877 I'll see if I can convince OPNsense to ship >= 2.3.6, tracking at opnsense/plugins#4003. |
That does look like an issue that would affect renewal, but i'm not sure it would cause this specific issue, if i understand correctly it'd bump the renewal timestamp so far it would live forever. It wouln't make it complain about the port being used. Unless it's an old mapping that sticks around. But that wouldn'e explain why it only happens on renewal. |
That doesn't seem like a good idea. IIUC the nonce is used to recognise the requesting application, and to distinguish a renew request from some other app trying to use the same port. |
It's not supposed to be a good idea, but i wonder if it works around the issue with your router, if they implemented the protocol wrongly. i honestly have no idea what could be the problem here i'm just guessing. |
I thought perhaps the issue was that miniupnpd checks whether at least half the lease time went by and would refuse if not. So then if the lease time was too far in the future, it wouldn't allow renewal. But looking through the source code it doesn't appear to care about that. There's a few places that can trigger a The pinhole code is mostly seperate from port mappings. Depending on Figured out how to get verbose logging (opnsense/plugins#4004). It does log
I'm probably still missing something... I should test if deleting an IPv6 pinhole does work. Deleting a pinhole succeeds. The log message is wrong ( I also tried opening only a single pinhole, to rule out some race condition. But that doesn't make a difference. Anyway, I'm fairly sure it's not a bug in this PR. |
Thanks for investigating in detail. |
Makes the name of the constant (approximately) match the behavior again.
Both our setups are at miniupnpd 2.3.3; the plugin description incorrectly says 2.3.1: opnsense/plugins#4003 (comment) So perhaps there's a difference in how miniupnpd controls the router / firewall itself, which causes the renewal to fail on my end. This is above my pay grade... |
i've been trying to build the package myself, and there's miniupnpd-nftables and miniupnpd-iptables, do you know which one you have? i see i have -iptables installed at the moment, it appears. Edit: i installed the -nftables variant and rebooted, and it's giving me |
No idea. OPNsense is built on FreeBSD and I think the plugin just relies on whatever miniupnpd comes with the OS:
I installed 2.3.6 from source and got the same Update: @laanwj |
Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
const int s{socket(AF_NETLINK, SOCK_DGRAM, NETLINK_ROUTE)}; | ||
if (s < 0) { | ||
LogPrintLevel(BCLog::NET, BCLog::Level::Error, "socket(AF_NETLINK): %s\n", NetworkErrorString(errno)); | ||
return std::nullopt; | ||
} | ||
Sock sock{static_cast<SOCKET>(s)}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Continuing the discussion from #30043 (comment), so that messages are grouped together, not scattered in the main PR thread.
The first two commits of #26812 would make it possible to test and fuzz how this code interacts with a router.
So ive been thinking about this, do we have a mockable way to do std::optional socket(int domain, int type, int protocol)? ...
Almost. Right now we have the "pointer"
std::function<std::unique_ptr<Sock>(const sa_family_t&)> CreateSock;
which in the real code points to the function
std::unique_ptr<Sock> CreateSockOS(sa_family_t address_family);
Tests that wish to mock the sockets redirect the pointer to another one CreateSockWhateverMockedStuff()
.
That address_family
is the first argument to the socket(2)
syscall. What's needed is to extend CreateSock*()
with the other two arguments - type
and protocol
. I can do that.
@Sjors, are you ACK on the first 2 commits of #26812? If yes, should I create a separate PR with those 2 commits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vasild I haven't had a chance to thoroughly review the second commit. Making a separate PR could help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM.
That address_family is the first argument to the socket(2) syscall. What's needed is to extend CreateSock*() with the other two arguments - type and protocol. I can do that.
Yes, for specifying UDP (and NETLINK, but i do not intend to make a test framework for that) it would need all three arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extended CreateSock()
in the topmost commit in https://github.com/vasild/bitcoin/commits/extend_CreateSock/.
Separate PR or include in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is already quite large, so let's make a fresh one that this can rebase on if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree adding fuzzing/testing makes sense as a follow-up PR. It's already big enough, and also i don't want to interfere with review by doing more active development here than address review comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #30202 "netbase: extend CreateSock() to support creating arbitrary sockets"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got a bit confused about these two new PRs, but I guess they are orthogonal.
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the Possibly this is due to a silent merge conflict (the changes in this pull request being Leave a comment here, if you need help tracking down a confusing failure. |
this should remove integer comparison warnings for all platforms.
I was able to build b4535d4 on my BSD 13.2 VM (errors and warnings are gone). Opening ports fails, maybe because it's a VM, but "Address family not supported by protocol family" seems an odd error for that.
On BSB 14.0 it seems to work fine:
(it doesn't actually get a mapping, but that's expected because VirtualBox doesn't come with PCP, afaik) |
Thanks for testing. Sounds like AF_NETLINK is not actually supported on that kernel, even though the necessary stuff is in the headers.
So let's bump the minimum to FreeBSD 14.0 instead?
It'd likely work if you set VirtualBox's VM networking to bridged instead of local NAT. |
On 13.2 it compiles but doesn't actually work at runtime due to AF_NETLINK error.
With a bridged network it indeed works. Also, as suggested by @vasild, doing |
Maybe too late or has been looked at. A PCP/NAT-PMP library https://github.com/libpcp/pcp |
Yes, some revievers have been looking at it as reference/comparison. This has been said before but: implementing a self-contained version of the RFCs here is intentional, we don't want to introduce a dependency on a third-party library. It shoudn't be needed for a straightforward fixed packet-based protocol. The intent is to have code that integrates with bitcoin core's frameworks for logging, networking and testing/fuzzing. At some point we want to be confident enough about it to enable it by default. |
Continues #30005. Closes #17012..
This PR adds PCP (Port Control Protocol) from RFC6887. This adds, in addition to the existing IPv4 port mapping (which now uses PCP, with fallback to NAT-PMP), support for IPv6 pinholing-that is, opening a port on the firewall to make it reachable.
PCP, like NAT-PMP is a simple UDP-based protocol, and the implementation is self-contained, so this gets rid of lthe libnatpnp dependency without adding a new one. It should otherwise be a drop-in replacement. NAT-PMP fallback is implemented so this will not make router support worse.
For now it is disabled by default, though in the future (not in this PR) we could consider enable it by default to increase the number of connectable nodes without adding significant attack surface.
To test:
(most of the changes in this PR are, ironically, removing the libnatpmp dependency and associated build system and build docs)
TODO