You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Theses three sections of code are needed to fix CVE-2023-25568, they make the server ignore any new queries overflowing 1024. Previously the server would remember infinitely many CIDs eventually OOMing.
However the truncation code always prefer keeping existing entries over new ones.
This means we can get in a situation where we get stuck:
let's say the client send a 2048 entries wantlist.
out of theses we only have 1/3 of the CIDs, and theses are uniformly distributed.
the server truncate the wantlist to 1024.
the server starts serving theses entries first, out of theses we don't have 683 CIDs.
we now only have 341 effective wantlist size. This is because the bitswap server never cleanup entries after sending DONT_HAVE.
The point of this feature is for -1 scalling, if the server is also downloading the same blocks, it might get them after having already sent DONT_HAVE, then the server can either send the block or the a HAVE message overriding the previous DONT_HAVE.
This repeat each time shrinking the usable wantlist on this connection because the part of the wantlist the server is willing to keep fills up with CIDs it does not have.
Eventually reaching 0
(note: the client can send CANCEL or a message with the full flag and theses 1024 "stuck" CIDs will be cleaned out properly, but the client isn't smart enough to realize this is happening)
I see three possible solutions:
Truncate left instead of truncating right.
This means newer entries would override older ones, instead of having older ones stick around.
My original thinking when writing the fix is that the server currently does not apply back-pressure, so we should let existing entries first so we can make some progress and clear them out when we send responses. If we always cancel what is already in flight a client that is too fast would never get any blocks because everything would be canceled before being sent.
This could be fixed by only canceling what is already queued but not entries that are actively being worked on.
Completely rewrite the server and architecture it request response
If we handled messages in a request response manner we could apply back-pressure when the client is sending too many queries.
We could still have a global CID → [PeerID] wantlist map however this would be limited to CIDs we don't have, for -1 scalling. CIDs we already have would be completely skip this flow and the queue and be handled purely in the message handler.
We would still need a limit on how many -1 tracked CIDs we have, but that means this bug would only impact blocks we don't have which SGTM.
Make the client aware and handle rotation itself by truncating itself and handle canceling overwrote entries.
This is nice because it does not require updating the servers, only the client, so fixed client can download from buggy servers.
Sounds PITA to code and I don't really want to do it tbh. Also this means we commit to having broke the protocol.
The text was updated successfully, but these errors were encountered:
Jorropo
added
need/triage
Needs initial labeling and prioritization
P1
High: Likely tackled by core team if no one steps up
kind/bug
A bug in existing code (including security flaws)
and removed
need/triage
Needs initial labeling and prioritization
labels
Dec 19, 2023
Jorropo
changed the title
Bitswap server fails in a toxic maner when the client exceed 1024 entries in the wantlist by a *lot* preventing any data transfer
bitswap/server: wantlist overflows fails in a toxic maner preventing any data transfer
Dec 19, 2023
I don't think it was written down anywhere, but me and @aschmahmann had verbal conversation
that as a compromise, to avoid rewriting everything, we should try:
implementing "Truncate left instead of truncating right" (make new entries override old ones)
with metric that shows how full the queue is (if we don't have it)
with configuration option to override default MaxQueuedWantlistEntiresPerPeer (allowing operators to tune their setup to have queue match increased load)
This is a bug I introduced in 9cb5cb5 when fixing CVE-2023-25568.
Here is a screenshot of our gateway's wantlists:
You can see that many gateways instances have more than 1024 entries.
This cause issues with:
boxo/bitswap/server/internal/decision/engine.go
Lines 669 to 677 in 9cb5cb5
boxo/bitswap/server/internal/decision/engine.go
Line 765 in 9cb5cb5
boxo/bitswap/server/internal/decision/engine.go
Line 851 in 9cb5cb5
Theses three sections of code are needed to fix CVE-2023-25568, they make the server ignore any new queries overflowing 1024. Previously the server would remember infinitely many CIDs eventually OOMing.
However the truncation code always prefer keeping existing entries over new ones.
This means we can get in a situation where we get stuck:
DONT_HAVE
.The point of this feature is for
-1
scalling, if the server is also downloading the same blocks, it might get them after having already sentDONT_HAVE
, then the server can either send the block or the aHAVE
message overriding the previousDONT_HAVE
.Eventually reaching 0
(note: the client can send CANCEL or a message with the full flag and theses 1024 "stuck" CIDs will be cleaned out properly, but the client isn't smart enough to realize this is happening)
I see three possible solutions:
Truncate left instead of truncating right.
This means newer entries would override older ones, instead of having older ones stick around.
My original thinking when writing the fix is that the server currently does not apply back-pressure, so we should let existing entries first so we can make some progress and clear them out when we send responses. If we always cancel what is already in flight a client that is too fast would never get any blocks because everything would be canceled before being sent.
This could be fixed by only canceling what is already queued but not entries that are actively being worked on.
Completely rewrite the server and architecture it request response
If we handled messages in a request response manner we could apply back-pressure when the client is sending too many queries.
We could still have a global
CID → [PeerID]
wantlist map however this would be limited to CIDs we don't have, for-1
scalling. CIDs we already have would be completely skip this flow and the queue and be handled purely in the message handler.We would still need a limit on how many
-1
tracked CIDs we have, but that means this bug would only impact blocks we don't have which SGTM.Make the client aware and handle rotation itself by truncating itself and handle canceling overwrote entries.
This is nice because it does not require updating the servers, only the client, so fixed client can download from buggy servers.
Sounds PITA to code and I don't really want to do it tbh. Also this means we commit to having broke the protocol.
The text was updated successfully, but these errors were encountered: