Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

init: fixes file descriptor accounting #30065

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

sr-gi
Copy link
Member

@sr-gi sr-gi commented May 8, 2024

The current logic for file descriptor accounting is pretty convoluted and hard to follow. This is partially caused by the lack of documentation plus non-intuitive variable naming (which was more intuitive when fewer things were accounted for, but
hasn't aged well). This has led to this accounting being error-prone and hard to maintain (as shown in the first commit of this PR).

Redefine some of the constants, plus document what are we accounting for so this can be extended more easily

Fixes #18911

@DrahtBot
Copy link
Contributor

DrahtBot commented May 8, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK TheCharlatan, BrandonOdiwuor
Stale ACK vasild

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #29641 (scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint by maflcko)
  • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

src/init.cpp Outdated
Comment on lines 984 to 987
nMaxConnections = std::max(std::min<int>(nMaxConnections, fd_max - nBind - MIN_CORE_FILEDESCRIPTORS - MAX_ADDNODE_CONNECTIONS - NUM_FDS_MESSAGE_CAPTURE), 0);
if (nFD < MIN_CORE_FILEDESCRIPTORS)
return InitError(_("Not enough file descriptors available."));
nMaxConnections = std::min(nFD - MIN_CORE_FILEDESCRIPTORS - MAX_ADDNODE_CONNECTIONS - NUM_FDS_MESSAGE_CAPTURE, nMaxConnections);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice how the latter nMaxConnections trim is redundant once we limit nFD = std::min(FD_SETSIZE, nFD);. Moving things around, plus accounting for nBind (which is currently not being accounted for in master), we get:

nMaxConnections = std::max(std::min<int>(nMaxConnections, nFD - nBind - MIN_CORE_FILEDESCRIPTORS - MAX_ADDNODE_CONNECTIONS - NUM_FDS_MESSAGE_CAPTURE), 0);
nMaxConnections = std::min(nMaxConnections, nFD - nBind - MIN_CORE_FILEDESCRIPTORS - MAX_ADDNODE_CONNECTIONS - NUM_FDS_MESSAGE_CAPTURE,);

@TheCharlatan
Copy link
Contributor

Concept ACK

@sr-gi
Copy link
Member Author

sr-gi commented May 8, 2024

I think it is worth mentioning that the current approach is defining the minimum amount of file descriptions required without accounting for a single connection*, this means that if we are at the bare minimum, we will run but won't be able to connect to any nodes. This is consistent with the current logic in master, but I think it's not the way it should be.

I'm happy to add another commit addressing this, but I've rathered start approaching this sticking to the same assumptions as master.

* This actually accounts for manual connections, which may never happen, but not to any of our outbound. If we happen to not create any manuals we'd have 8 slots (MAX_ADDNODE_CONNECTIONS) for outbounds

@sr-gi
Copy link
Member Author

sr-gi commented May 8, 2024

@vasild you may be interested in this. I decided to fix it when seeing you extending the current logic in #29415

Copy link
Contributor

@BrandonOdiwuor BrandonOdiwuor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

src/init.cpp Outdated Show resolved Hide resolved
@vasild
Copy link
Contributor

vasild commented May 22, 2024

The current logic for file descriptor accounting is pretty convoluted and hard to follow.

Concept ACK, I stumbled on this recently in #29415

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach ACK 55f16f5

Might as well squash the last commit 55f16f5 init: Remove FreeBSD workaround to #2695 into some of the previous ones that touch that line.

src/init.cpp Show resolved Hide resolved
src/init.cpp Outdated Show resolved Hide resolved
src/init.cpp Outdated Show resolved Hide resolved
src/init.cpp Outdated Show resolved Hide resolved
src/init.cpp Outdated Show resolved Hide resolved
src/init.cpp Outdated
#else
int fd_max = FD_SETSIZE;
// Reserve enough FDs to account for the bare minimum, plus any manual connections, plus the bound interfaces
int nMinRequiredFds = MIN_CORE_FDS + MAX_ADDNODE_CONNECTIONS + nBind;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to doc/developer-notes.md variables should use snake_case and no need to denote the type in the variable name (n for integer): min_required_fds.

But more importantly, the name nMinRequiredFds is misleading because we accept a smaller number - a bit below we check if (nFD < MIN_CORE_FDS) return error..., so it is not "minimum required". I think this variable here should be: min_required_fds = MIN_CORE_FDS + nBind, that check should be if (nFD < min_required_fds) return error and then:

-    nFD = RaiseFileDescriptorLimit(nMaxConnections + nMinRequiredFds);
+    nFD = RaiseFileDescriptorLimit(min_required_fds + MAX_ADDNODE_CONNECTIONS + nMaxConnections);

and

-    nMaxConnections = std::max(std::min<int>(nMaxConnections, nFD - nMinRequiredFds), 0);
+    nMaxConnections = std::max(std::min<int>(nMaxConnections, available_fds - min_required_fds - MAX_ADDNODE_CONNECTIONS), 0);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the if check should be updated, but I don't necessarily agree with splitting MAX_ADDNODE_CONNECTIONS from min_required_fds.

There is the outstanding question of whether the minimum required fds to run Core should account for some of the connections. Currently, we are not accounting for any (given the check is performed against MIN_CORE_FDS), but I think we should be accounting for at least, outbound + manuals

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, no strong opinion whether to include or exclude MAX_ADDNODE_CONNECTIONS in min_required_fds.

I guess better to update the check. It is funny to have code like

min_required = ...;
if (available < 10) error

it should really be

min_required = ...;
if (available < min_required) error

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed that, plus increased the min_required_fds to also account for automatic outbounds in the last commit. I'm open to discuss about the latter, but I think it's something we should also account for

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The newly added int auto_outbounds = std::min(MAX_OUTBOUND_FULL_RELAY_CONNECTIONS + MAX_BLOCK_RELAY_ONLY_CONNECTIONS + MAX_FEELER_CONNECTIONS, nUserMaxConnections); seems to be duplicating this logic, but not exactly:

bitcoin/src/net.h

Lines 1069 to 1073 in be100cf

m_max_automatic_connections = connOptions.m_max_automatic_connections;
m_max_outbound_full_relay = std::min(MAX_OUTBOUND_FULL_RELAY_CONNECTIONS, m_max_automatic_connections);
m_max_outbound_block_relay = std::min(MAX_BLOCK_RELAY_ONLY_CONNECTIONS, m_max_automatic_connections - m_max_outbound_full_relay);
m_max_automatic_outbound = m_max_outbound_full_relay + m_max_outbound_block_relay + m_max_feeler;
m_max_inbound = std::max(0, m_max_automatic_connections - m_max_automatic_outbound);

and I think this is getting complicated again. I would suggest to keep it simple - drop the last commit f569b06 init: fix min required fds check and account for outbounds

and only apply this change to 3982543 init: improves file descriptors accounting and docs:

-    if (available_fds < MIN_CORE_FDS)
+    if (available_fds < min_required_fds)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is duplicating part of that, given CConnman::m_max_automatic_outbound is private.

I agree it is adding more complexity, but not accounting for it in the minimum means that could be in a situation where no single outbound connection is accounted for, so the user is basically only connected to manuals.

I acknowledge that's fairly rare, but the two reasons why we are currently not facing this issue is, first, because the minimum is fairly low, so it would need a system with really limited resources to become an issue, and second, because If we were in that situation it would also need to be that case that the node is defining manual connections, otherwise the 8 reserved slots for manual would be available for automatic.

Despite that, this feels like bad design, so I'd argue accounting for automatic is something we should do (though not necessarily as done in the current approach)

Copy link
Contributor

@vasild vasild May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the node is connected to 8 manually defined outbound peers and 0 inbounds and 0 automatic outbounds, then it is still operational? No need to prohibit startup in this case if there is a chance that this might happen?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it though? If the user has not specified that he wants no automatic connections (-maxconnections != 0) I'd argue he'd expect them to happen.

In any case, I think this is such an edge case that it is not worth blocking the PR on it. If you have a strong opinion on this, I'm happy to drop it and squash the rest of the commit to one of the previous ones

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am somewhat ~0 on erroring in the case of 8manual/0in/0out, but my bigger anxiety with the latest commit is that it makes it confusing again (or too complicated in my judgement). I tried to code a simple way to account for the automatic outbounds but gave up. Also, most naturally, as the "minimum required" name suggests, the check should be if (... < min_required). I find if (... < min_required + something more) confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the "making it more confusing". Given this is an edge case, I'll remove the outbound count and we can account for it if someone has a better way of doing so, or if someone opens an issue about it.

src/init.cpp Outdated Show resolved Hide resolved
We are computing our file descriptors limits based on whether we use
poll or select. However, we are taking that into account only partially
(subtracting from fd_max in one case, but from nFD later on). Moreover,
nBind is also only accounted for partially.

Simplify and fix this logic
@sr-gi
Copy link
Member Author

sr-gi commented May 24, 2024

Thanks for the suggestions @vasild, I've partially covered them and commented on some of the pending ones

@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/25385987025

@sr-gi sr-gi force-pushed the 2024-05-fdcount branch 2 times, most recently from fe92353 to fa42bf7 Compare May 24, 2024 15:59
Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fa42bf7

This is already an improvement over master. Would be nice to resolve #30065 (comment) as well.

sr-gi added 3 commits May 31, 2024 09:40
The current logic for file descriptor accounting is pretty convoluted and hard
to follow. This is partially caused by the lack of documentation plus non-intuitive
variable naming (which was more intuitive when fewer things were accounted for, but
hasn't aged well). This has led to this accounting being error-prone and hard to maintain
(as shown in he previous commit).

Redefine some of the constants, plus document what are we accounting for so this can be
extended more easily

Remove FreeBSD workaround to bitcoin#2695
We were setting a minimum required file descriptor count but testing against
a smaller value (notice this is independent from the refactor, it was already
the case).
Also, while automatic outbounds were accounted for in nUserMaxConnections, they
were not used to check against our minimum required number of fds.
@sr-gi
Copy link
Member Author

sr-gi commented May 31, 2024

ACK fa42bf7

This is already an improvement over master. Would be nice to resolve #30065 (comment) as well.

Addressed all comments now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve minimum file descriptor accounting and documentation
7 participants