Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backport: Adds SipHashUint256Extra and moves CSipHasher to it's own file in crypto/ directory #3496

Draft
wants to merge 11 commits into
base: 1.15.0-dev
Choose a base branch
from

Conversation

xanimo
Copy link
Member

@xanimo xanimo commented Mar 27, 2024

This is a move-only commit with the exception of changes to includes.

This adds SipHashUint256Extra:
Cherry-picked from: 7e00322

And moves CSipHasher to it's own file in crypto/ directory:
Cherry-picked from: 4fb789e

sipa and others added 9 commits March 23, 2024 16:21
These are available in sandboxes without access to files or
devices. Also [they are safer and more straightforward](https://en.wikipedia.org/wiki/Entropy-supplying_system_calls)
to use than `/dev/urandom` as reading from a file has quite a few edge
cases:

- Linux: `getrandom(buf, buflen, 0)`. [getrandom(2)](http://man7.org/linux/man-pages/man2/getrandom.2.html)
  was introduced in version 3.17 of the Linux kernel.
- OpenBSD: `getentropy(buf, buflen)`. The [getentropy(2)](http://man.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man2/getentropy.2)
  function appeared in OpenBSD 5.6.
- FreeBSD and NetBSD: `sysctl(KERN_ARND)`. Not sure when this was added
  but it has existed for quite a while.

Alternatives:

- Linux has sysctl `CTL_KERN` / `KERN_RANDOM` / `RANDOM_UUID`
  which gives 16 bytes of randomness. This may be available
  on older kernels, however [sysctl is deprecated on Linux](https://lwn.net/Articles/605392/)
  and even removed in some distros so we shouldn't use it.

Add tests for `GetOSRand()`:

- Test that no error happens (otherwise `RandFailure()` which aborts)
- Test that all 32 bytes are overwritten (initialize with zeros, try multiple times)

Discussion:

- When to use these? Currently they are always used when available.
  Another option would be to use them only when `/dev/urandom` is not
  available. But this would mean these code paths receive less testing,
  and I'm not sure there is any reason to prefer `/dev/urandom`.

Closes: #9676
Move the OS random test to a sanity check function that is called every
time bitcoind is initialized.

Keep `src/test/random_tests.cpp` for the case that later random tests
are added, and keep a rudimentary test that just calls the sanity check.
If the code was compiled with newer (>=3.17) kernel headers but executed
on a system without the system call, every use of random would crash the
program. Add a fallback for that case.
@xanimo xanimo changed the title Extract CSipHasher to it's own file in crypto/ directory. backport: Extract CSipHasher to it's own file in crypto/ directory. Mar 27, 2024
@xanimo

This comment was marked as resolved.

@patricklodder patricklodder requested a review from a team March 27, 2024 15:32
@patricklodder patricklodder added this to the 1.15.0 milestone Mar 27, 2024
@patricklodder
Copy link
Member

Question before concept ACK: this was done upstream to not have to include crypto/hash.h but be able to just include the Siphash header on its own. I don't see any includes being removed though. Is this because we are indirectly including crypto/hash.h everywhere?

@xanimo
Copy link
Member Author

xanimo commented Mar 27, 2024

Hmm, I'll take a look at that I'm not sure atm. I picked this one because it's where SipHashUint256Extra comes into the code space. 6268515#diff-70a1abe85d9d9fa8729884b20556bcf4f8ce94f50a051df85cccd6b9ca873078R134-R173

@xanimo
Copy link
Member Author

xanimo commented Mar 27, 2024

Edit: it's actually added here but for the sake of brevity I chose this one.

@patricklodder
Copy link
Member

I picked this one because it's where SipHashUint256Extra comes into the code space.

Not related to whether we want to split off this file, but: SipHashUint256Extra is used to hash over a uint256 and additional bytes for utxos. This is not used in any new code in #3229. What do we need that function for right now?

Switching utxo tracking is a consensus code change that should be executed with extreme caution. I don't mind doing it, but then it really needs a plan.

@xanimo
Copy link
Member Author

xanimo commented Mar 27, 2024

I believe it's first needed for picking xanimo@b8c03af part of https://github.com/bitcoin/bitcoin/pull/10321/commits unless we want to remove that test?

Yeah wasn't my intention to touch any utxo tracking / consensus code. I still have a fair bit of that work already done though during my research for torv3.

@patricklodder
Copy link
Member

We cannot backport scripted diff commits in 99% of the cases anyway, because code has diverged. The only usable thing from these could be the script, but even these don't always work, so those need TLC.

Right now, your commit (and a400df2 from the branch you linked) is invalid because you backported the code into the scripted diff:

$ contrib/devtools/commit-script-check.sh 1.15.0-dev..b8c03afdb1
Running script for: a400df2a77bc0888b57a7bd76bfa10dc0b66355e
sed -i 's/insecure_rand/local_rand_ctx/g' src/test/cuckoocache_tests.cpp
diff --git a/src/test/cuckoocache_tests.cpp b/src/test/cuckoocache_tests.cpp
index 6331f7e68..916b23cee 100644
--- a/src/test/cuckoocache_tests.cpp
+++ b/src/test/cuckoocache_tests.cpp
@@ -62,8 +62,7 @@ BOOST_AUTO_TEST_CASE(test_cuckoocache_no_fakes)
 {
     local_rand_ctx = FastRandomContext(true);
     CuckooCache::cache<uint256, uint256Hasher> cc{};
-    size_t megabytes = 4;
-    cc.setup_bytes(megabytes << 20);
+    cc.setup_bytes(32 << 20);
     uint256 v;
     for (int x = 0; x < 100000; ++x) {
         insecure_GetRandHash(v);
Failed
Running script for: b8c03afdb1b063b2fdac549b72fca44065314f58
sed -i "s/\<GetRandHash(/insecure_rand256(/" src/test/*_tests.cpp
sed -i "s/\<GetRand(/insecure_randrange(/" src/test/*_tests.cpp src/test/test_bitcoin.cpp
sed -i 's/\<insecure_rand() % \([0-9]\+\)/insecure_randrange()/g' src/test/*_tests.cpp
diff --git a/src/test/hash_tests.cpp b/src/test/hash_tests.cpp
index 4e88a0ea5..d8de765db 100644
--- a/src/test/hash_tests.cpp
+++ b/src/test/hash_tests.cpp
@@ -128,23 +128,6 @@ BOOST_AUTO_TEST_CASE(siphash)
     tx.nVersion = 1;
     ss << tx;
     BOOST_CHECK_EQUAL(SipHashUint256(1, 2, ss.GetHash()), 0x79751e980c2a0a35ULL);
-
-    // Check consistency between CSipHasher and SipHashUint256[Extra].
-    FastRandomContext ctx;
-    for (int i = 0; i < 16; ++i) {
-        uint64_t k1 = ctx.rand64();
-        uint64_t k2 = ctx.rand64();
-        uint256 x = insecure_rand256();
-        uint32_t n = ctx.rand32();
-        uint8_t nb[4];
-        WriteLE32(nb, n);
-        CSipHasher sip256(k1, k2);
-        sip256.Write(x.begin(), 32);
-        CSipHasher sip288 = sip256;
-        sip288.Write(nb, 4);
-        BOOST_CHECK_EQUAL(SipHashUint256(k1, k2, x), sip256.Finalize());
-        BOOST_CHECK_EQUAL(SipHashUint256Extra(k1, k2, x, n), sip288.Finalize());
-    }
 }
 
 BOOST_AUTO_TEST_SUITE_END()
Failed

@patricklodder
Copy link
Member

Alright, so concept ACK on moving it. But should keep it move-only as the commit says. I'm not sure how urgent the move is without the added code though, I leave that up to you.

sipa and others added 2 commits March 27, 2024 15:00
We'll need a version of SipHash for tuples of 256 bits and 32 bits
data, when CCoinsViewCache switches from using txids to COutPoints as
keys.
This is a move-only commit with the exception of changes to includes.

Cherry-picked from: 4fb789e
@xanimo xanimo force-pushed the 1.15.0-dev-extract-csiphasher branch from 6268515 to 4aa8403 Compare March 27, 2024 22:09
@xanimo xanimo changed the title backport: Extract CSipHasher to it's own file in crypto/ directory. backport: Adds SipHashUint256Extra and moves CSipHasher to it's own file in crypto/ directory Mar 27, 2024
@xanimo
Copy link
Member Author

xanimo commented Mar 27, 2024

I'm not sure we actually need this at the moment but have consolidated both SipHashUint256Extra along with moving siphash to it's own files in src/crypto/. Moving to draft to give me time to check the crypto/hash.h includes. Also fine with closing.

@xanimo xanimo marked this pull request as draft March 27, 2024 22:24
@patricklodder
Copy link
Member

Here's my thoughts on this: IF someone wants to change the granularity of CCoinsViewCache to be at the level of utxo instead of transaction, we'll have to quantify the benefits (maybe/hopefully faster input validation) and costs (disk space, memory) for Dogecoin in 2024 as this may very well be different than it was for Bitcoin in 2017 when this change was made there.

Messagehandler thread load is a thing right now though, because during spam, Dogecoin sees 4-5x the traffic Bitcoin does, and my best nodes haven't been able to serve 125 peers well during these times so I've had to reduce that. If there are gains in validation (cpu) costs to be made with such a change, it would be good to look at it, as long as we can get a clear picture of the consequences.

The addition of SipHashUint256Extra should be part of that solution, but I'd suggest to do that as part of back porting the entire change though, because although that's a big one, it is one of those "all or nothing" things.

In addition to this, a node with NODE_BLOOM enabled (>95% of all connectable nodes my spider sees) will also take a lot more time validating. If we're to both backport the blockfilter (BIP158 and BIP157) and the subsequent change to use clean async indexers outside of immediate consensus, then having a separate siphash per the 2nd commit makes sense (and that's what that particular commit was made for in the first place.)

Although on my own nodes I do not observe many clients that use bloom filters nowadays, if those were to become more commonplace, having that async, deterministic filter set up would help at least not degrading the Messagehandler thread further, as the filters can be constructed async on a separate thread and serve'em-when-we-get'em.

For any node that runs txindex, this can help with performance too.

So bottom line, I'd suggest to leave this in draft for now, and it can be turned into either of the above solutions.

@patricklodder patricklodder removed this from the 1.15.0 milestone Mar 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport cleanup Cleans up code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants