32-bit ARM very slow xxhash #505
Comments
A classical issue with […]
There are a few things that can be attempted here.
On the library side, there are multiple access methods which are provided:
The most aggressive is […]
A bit safer, […]
This may require specifying some additional compilation flag.
Not sure if it's an option, but […]
Finally, on the user side […]
edit: fixed memory access value, as underlined by @easyaspi314
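To make the "access methods" mentioned above more concrete, here is a minimal sketch of three common ways to read a 32-bit value from a possibly unaligned address in C. xxHash selects among methods of this kind at build time through its `XXH_FORCE_MEMORY_ACCESS` macro, but the helper names below (`read32_memcpy`, `read32_le_bytes`, `read32_packed`) are hypothetical and this is not the library's actual code. Which variant is fastest on a given 32-bit ARM target depends on the compiler version and flags, so treat it as something to measure rather than a recommendation.

```c
#include <stdint.h>
#include <string.h>

/* Method A: memcpy. Always well-defined. With optimization enabled,
 * compilers usually turn this into a plain load where the target allows
 * it; without optimization it may remain a real function call. */
static uint32_t read32_memcpy(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Method B: assemble the value byte by byte, here as little-endian.
 * Safe on any alignment and any host endianness. */
static uint32_t read32_le_bytes(const void *p)
{
    const uint8_t *b = (const uint8_t *)p;
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

#if defined(__GNUC__)
/* Method C: packed struct, a GCC/Clang extension. It tells the compiler
 * the field may be unaligned so it can pick suitable load instructions. */
typedef struct { uint32_t v; } __attribute__((packed)) unaligned_u32;

static uint32_t read32_packed(const void *p)
{
    return ((const unaligned_u32 *)p)->v;
}
#endif
```

Methods A and C return the value in host byte order, while method B always interprets the bytes as little-endian. A direct pointer cast (`*(const uint32_t *)p`) is the most aggressive option and is deliberately left out of the sketch, since it is undefined behaviour on misaligned addresses.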
If I am not mistaken, the correct general purpose flags for that CPU would be this:
However, as Yann said, I would also recommend Clang, as it tends to generate better code, especially with NEON. Additionally, did you try a newer GCC version? GCC's ARM and AArch64 backends are rather mediocre, and were pretty bad until recently.
@Cyan4973 btw,
It seems that compiler optimizations were not turned on correctly by the toolchain. When they are set explicitly for the build, the xxhash variants run as fast as they should. Thanks @Cyan4973 and @easyaspi314.
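Since the root cause here was a toolchain that silently dropped the optimization flags, a quick sanity check compiled with the same flags as the real build can help. The sketch below relies on two predefined macros: `__OPTIMIZE__`, which GCC and Clang define at `-O1` and above, and `__ARM_NEON` (or the older `__ARM_NEON__`), which is defined when NEON code generation is enabled. The function names are hypothetical and not part of xxHash.

```c
#include <stdio.h>

/* Returns 1 if this translation unit was compiled with optimization
 * enabled (GCC/Clang define __OPTIMIZE__ at -O1 and above), else 0. */
static int built_with_optimization(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the compiler was allowed to generate NEON code
 * for this translation unit, else 0. */
static int built_with_neon(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* Prints a one-line summary of the effective build configuration. */
static void print_build_config(void)
{
    printf("optimization: %s, NEON: %s\n",
           built_with_optimization() ? "on" : "off",
           built_with_neon() ? "on" : "off");
}
```

Calling `print_build_config()` from a small `main` built by the same toolchain shows whether the flags actually reached the compiler. A report of "optimization: off" would be consistent with the slowdown described in this issue.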
Compiled with gcc 4.8.5 and tested on a dual-core 32-bit ARM Cortex-A9, all the xxhash algorithms are very slow and lose even to a crc32 implementation, regardless of the input size. Tested on both dev and v0.8.0; it makes no difference. The NEON path is activated.
lscpu:
Flags: half thumb fastmult vfp edsp neon vfpv3 tls vfpd32
It seems that this code is not well tested on 32-bit ARM?