
Add SipHash to benchmarks...? #162

Closed
boazsegev opened this issue Jan 3, 2019 · 6 comments

@boazsegev

I just discovered this repo, and I love it 👍

I'm very impressed by both the function and the performance.

I'm curious whether SipHash could be added to the benchmark suites? It has its own Wikipedia page and appears to be implemented on many platforms and used by some standard libraries.

I know the benchmarks are half implementation and half algorithm design, but both hash implementations appear mature enough to make me curious whether pushing Ruby and Python to adopt xxHash over SipHash might be a good idea.

Thanks!

@Cyan4973
Owner

Cyan4973 commented Jan 4, 2019

Sure, it's a good suggestion.

SipHash is portable (I believe), which is the main requirement for it to be comparable.
I suspect it is at a disadvantage for raw speed, though its main selling point is that it offers improved cryptographic protection regarding collision generation.

I need to find time to build a new benchmark platform for this though.
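
In the meantime, a rough throughput loop along these lines gives a first approximation (just a sketch: it assumes xxhash.h plus a linked-in SipHash implementation exposing the reference siphash() prototype, and it is not the benchmark code behind xxHash's published numbers):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include "xxhash.h"   /* XXH64() */

/* Reference SipHash entry point (veorq/SipHash); assumed to be linked in. */
int siphash(const void *in, size_t inlen, const void *k, uint8_t *out, size_t outlen);

#define BUF_SIZE (100 * 1024)
#define ITERS    20000

int main(void)
{
    static char buf[BUF_SIZE];
    uint8_t  key[16] = {0};
    uint64_t sink = 0;
    memset(buf, 0xA5, sizeof(buf));

    clock_t t0 = clock();
    for (int i = 0; i < ITERS; i++)
        sink ^= XXH64(buf, sizeof(buf), (uint64_t)i);
    double xxh_s = (double)(clock() - t0) / CLOCKS_PER_SEC;

    t0 = clock();
    for (int i = 0; i < ITERS; i++) {
        uint64_t h;
        key[0] = (uint8_t)i;   /* vary the key so the call can't be hoisted */
        siphash(buf, sizeof(buf), key, (uint8_t*)&h, sizeof(h));
        sink ^= h;
    }
    double sip_s = (double)(clock() - t0) / CLOCKS_PER_SEC;

    double mb = (double)BUF_SIZE * ITERS / (1024.0 * 1024.0);
    printf("XXH64   : %8.1f MB/s\n", mb / xxh_s);
    printf("SipHash : %8.1f MB/s\n", mb / sip_s);
    printf("(checksum to keep results live: %llu)\n", (unsigned long long)sink);
    return 0;
}
```

Buffer size and iteration count are arbitrary; for small-input behaviour the same loop would hash short keys instead of a 100 KB buffer.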

@boazsegev
Author

Actually, I found some initial tests at this SMHasher fork, showing xxHash is at least twice as fast for small inputs and far faster for longer inputs.

However, I think this won't be enough...

[SipHash's] main selling point is that it offers improved cryptographic protection regarding collision generation.

I think this is a bigger requirement than I realized, especially given the risks related to hash-flooding attacks (for anyone else reading this, here is a high-level explanation and a more detailed one).

I found this thread indicating that it's possible to produce collisions on demand.

I don't know if it's possible to use this weakness to create multi-collisions and mount hash-flooding attacks... but I think resistance to collision generation should be considered a requirement for widespread adoption.
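
To make the risk concrete: if an attacker can produce many keys that all land in the same bucket, a typical chained hash map degenerates into one long linked-list scan, so N insertions cost O(N^2) work. A toy sketch of that degenerate case (purely illustrative, not taken from any real implementation):

```c
#include <stdlib.h>
#include <string.h>

/* Toy chained hash map, for illustration only. */
typedef struct node { const char *key; struct node *next; } node;

#define NBUCKETS 1024
static node *buckets[NBUCKETS];

static size_t toy_hash(const char *key) {
    size_t h = 0;
    for (; *key; key++) h = h * 31 + (unsigned char)*key;
    return h;
}

/* Insert with a duplicate check, so every insert walks its whole chain. */
static void toy_insert(const char *key) {
    size_t b = toy_hash(key) % NBUCKETS;
    for (node *n = buckets[b]; n; n = n->next)      /* O(chain length) */
        if (strcmp(n->key, key) == 0) return;
    node *n = malloc(sizeof *n);
    n->key = key;                                   /* caller keeps ownership */
    n->next = buckets[b];
    buckets[b] = n;
}

/* With random keys the chains stay short and N inserts cost roughly O(N).
 * If an attacker can supply N keys that all land in one bucket, the scans
 * cost 0 + 1 + ... + (N-1) comparisons: O(N^2) total, which is exactly the
 * hash-flooding denial of service those explanations describe. */
```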

...

As a security-related side note, by setting the first 32 bytes of a message to a specific value, all vectors are equal to the seed after the first round.

This might be exploited to extract seed data or compromise the hash. I'm playing around with my own variation where I use irreversible manipulations to prevent this.
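
To spell out why such a value exists: XXH64 initializes its four accumulators as the seed plus fixed prime offsets, and because PRIME64_2 is odd (hence invertible mod 2^64), a fixed 32-byte prefix can cancel those offsets, leaving every accumulator at the same seed-derived value after the first round. A rough sketch with the published constants (inv64() and collapsing_prefix() are illustrative names, not code from xxHash or from my own experiments):

```c
#include <stdint.h>

#define PRIME64_1 0x9E3779B185EBCA87ULL
#define PRIME64_2 0xC2B2AE3D27D4EB4FULL

static uint64_t rotl64(uint64_t x, int r) { return (x << r) | (x >> (64 - r)); }

/* One XXH64 accumulator round, as specified: acc <- rotl(acc + lane*P2, 31) * P1 */
static uint64_t xxh64_round(uint64_t acc, uint64_t lane) {
    return rotl64(acc + lane * PRIME64_2, 31) * PRIME64_1;
}

/* Multiplicative inverse of an odd 64-bit integer mod 2^64 (Newton iteration). */
static uint64_t inv64(uint64_t a) {
    uint64_t x = a;                               /* correct to the low 3 bits */
    for (int i = 0; i < 5; i++) x *= 2 - a * x;   /* precision doubles each step */
    return x;
}

/* XXH64 initializes its accumulators as seed + {P1+P2, P2, 0, -P1}.  Because
 * PRIME64_2 is odd, the fixed lane -offset * inv(PRIME64_2) cancels each
 * offset for ANY seed, so after xxh64_round() all four accumulators hold the
 * same seed-derived value rotl64(seed, 31) * PRIME64_1.  Serialized as four
 * little-endian 64-bit words, these lanes form the 32-byte prefix in question. */
static void collapsing_prefix(uint64_t lanes[4]) {
    const uint64_t offsets[4] = { PRIME64_1 + PRIME64_2, PRIME64_2, 0,
                                  (uint64_t)0 - PRIME64_1 };
    const uint64_t invP2 = inv64(PRIME64_2);
    for (int i = 0; i < 4; i++)
        lanes[i] = (uint64_t)0 - offsets[i] * invP2;   /* seed-independent */
}
```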

@Cyan4973
Owner

As I'm going to run a new round of benchmarks for the next xxHash release, I may as well add SipHash to the list.

@boazsegev
Author

FYI:

I was inspired by xxHash's approach and used something similar in my own project / library (I named it RiskyHash; you can find the source code here).

BTW:

The more I read, the less convinced I am of the need for hash function "security" where hash maps are concerned. IMHO, the hash map implementation should handle security concerns, not the hashing function. Following this logic, using a faster hashing function could add a significant performance boost.
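
For example, the map itself can pick a random seed per instance and watch its own chain lengths; a sketch of that idea (map_t, map_hash() and map_looks_flooded() are made-up names, not from any particular library):

```c
#include <stdint.h>
#include <stddef.h>
#include "xxhash.h"   /* XXH64() */

/* Hypothetical map object: every instance owns a random seed, so a collision
 * set precomputed against one table does not transfer to another. */
typedef struct {
    uint64_t seed;        /* drawn from a real entropy source (getrandom, etc.) */
    size_t   nbuckets;
    size_t   max_chain;   /* threshold after which the map defends itself */
    /* ... bucket storage ... */
} map_t;

static uint64_t map_hash(const map_t *m, const void *key, size_t len) {
    return XXH64(key, len, m->seed);
}

/* Map-level defense: if one chain grows suspiciously long, the map reacts
 * (pick a fresh seed and rehash, or turn that bucket into a balanced tree),
 * no matter how "secure" the underlying hash function is. */
static int map_looks_flooded(const map_t *m, size_t chain_len) {
    return chain_len > m->max_chain;
}
```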

@easyaspi314
Contributor

Yes, IMO, SipHash is overkill. Even with the "fast" 1-3 variant, its performance is less than half that of XXH32.

You only need a decent hash:

  1. The hash must be seedable. Any unseeded hash can be precalculated. Hashes have a virtually infinite number of collisions; the only question is how easy they are to find.
  2. No seed-independent collisions. The MurmurHash incident was such a big deal because you could easily generate values which would collide regardless of the seed. It doesn't matter what seed you use; that hash is fucked.
  3. It must have good distribution. This makes sure that the buckets are filled evenly, and makes it so you can resize to a power of 2 and avoid a modulus (see the sketch after this list).
  4. It must be fast. The whole point of a hash table is to be as fast as possible. If you are spending 64 cycles a byte, why bother?
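
A quick sketch of what point 3 buys you in practice (bucket_index() is just an illustrative helper built on XXH64; any well-distributed hash works the same way):

```c
#include <stdint.h>
#include <stddef.h>
#include "xxhash.h"

/* With a well-distributed hash the table can stay sized as a power of 2,
 * and the bucket index is a single AND instead of a much slower modulo. */
static size_t bucket_index(const void *key, size_t len,
                           uint64_t seed, size_t nbuckets /* power of 2 */) {
    return (size_t)XXH64(key, len, seed) & (nbuckets - 1);
}
```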
