Add Meow hash in the benchmarks #155
meow_hash depends entirely on the presence of hardware-accelerated AES instructions. I suspect it won't even compile in their absence: the code is fairly short, and I don't see any software fallback path. Moreover, all the AES instructions I could find use Intel intrinsics directly, so the code isn't portable across architectures, irrespective of hardware capabilities. That's not necessarily a "bad" thing: in exchange, they get great speed on long inputs. But the main point is that, with reliance on such a specialized hardware instruction set, portability is off the feature list. As a side effect, it would not even run on the platform used for the xxHash benchmarks. Maybe I should spend some time documenting this.
This is no longer true; you should revisit it.
Yeah, reading the announcements, Meow has improved a lot over a few revisions. I'll need some time to take a deeper look. From a cursory look, I couldn't find a software fallback path; maybe it's there and I just missed it.
For reference, this is against my latest version, on a 2.0 GHz Core i7 Gen 2 (Sandy Bridge), Clang 7.0.1:
...aaand in Meow 0.5 it's true again; in the latest version they once more have only an x64 AES code path :)
So yeah, Meow Hash is pretty much specific to x86_64 on Nehalem+, or else it has to fall back to a very slow version. Meanwhile, XXH3 has vectorized code paths for all x86_64 (including Core 2 and friends), Pentium 4+ (SSE2, which is required for Windows 7+ anyway), ARMv7-A with NEON (available on most Androids and all iOS devices except the original iPhone), ARM64 (all recent iPhones and Androids), and POWER9 VSX (many servers and supercomputers, though a niche market). And if that weren't enough, even the scalar version is still very fast on 32-bit targets, provided they have a decent multiplier.
I'm starting to gather more benchmark results, as there have been several requests of this kind over the years.
That being said, a worrying development is the discrepancy in how results are represented and interpreted. It's unclear whether a reader, after finding the first performance figure in the prominent top table, will bother looking below for additional graphs to guide their choice, potentially missing information that matters for their use case.

Another issue is the graphs themselves. A sorted table can be extended relatively easily: there are limits, sure, but generally speaking, adding a contender adds just one line. Adding a paragraph containing "raw" benchmark data, while commendable, is a poor substitute.

Therefore, I was considering "summarizing" the small-data test outcome into a single "performance number" usable in a ranked table, to give at least a hint of general performance on small data. Interested readers would still have to look further down for more accurate information (graphs), but at least the notion that some algorithms are more suitable for small data than others could be conveyed in a relatively simple manner.

There is still the issue that graphs can only represent a limited number of contenders, and maybe the one a reader wants to observe isn't part of the selected list (even when raw data is available). I was wondering if there would be a better way to represent graphs: something more dynamic than a prepared screenshot, allowing a user to select which contenders should be visible.
Meow hash added to the comparison: adding an entry to the table is fine.
https://github.com/cmuratori/meow_hash