cleanup: SIMD runtime detection #132

Merged: 1 commit merged into seanmonstar:master on Apr 20, 2023


AaronO
Contributor

@AaronO AaronO commented Apr 14, 2023

Also cleanup, builds off #131

The reduced overhead shows up in uri parsing for smaller inputs (where the overhead is relatively significant), and it compounds in header/count, which accumulates the overhead of jumping in and out of SIMD.

header/count

| | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 |
|---|---|---|---|---|---|---|---|---|
| Before | 22 | 39 | 77 | 144 | 283 | 578 | 1092 | 2159 |
| After | 21 | 37 | 71 | 135 | 271 | 568 | 1025 | 2034 |

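uri

| | 1b | 2b | 4b | 8b | 16b | 32b | 64b | 128b | 256b | 512b | 1024b | 2048b | 4096b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Before | 7 | 8 | 9 | 11 | 8 | 6 | 7 | 11 | 19 | 34 | 67 | 127 | 270 |
| After | 5 | 5 | 7 | 9 | 6 | 5 | 6 | 9 | 20 | 31 | 60 | 119 | 255 |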
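
To make the "relatively significant" point concrete, here is a small sketch that computes the relative reduction for a few entries copied from the tables above. The helper name `pct_reduction` is made up for illustration and is not part of the crate or its benchmarks.

```rust
/// Percentage reduction going from `before` to `after`
/// (positive means the "After" run took less time).
fn pct_reduction(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

fn main() {
    // (label, before, after) values copied from the tables above.
    let rows = [
        ("uri 1b", 7.0, 5.0),
        ("uri 4096b", 270.0, 255.0),
        ("header/count 1", 22.0, 21.0),
        ("header/count 128", 2159.0, 2034.0),
    ];
    for &(label, before, after) in rows.iter() {
        println!("{}: {:.1}% less time", label, pct_reduction(before, after));
    }
}
```

For uri, the smallest input improves by roughly 29% while the largest improves by under 6%, which fits the reading that a fixed per-call overhead matters most when the parsed input is small.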

@seanmonstar
Owner

cc @Noah-Kennedy

@AaronO
Contributor Author

AaronO commented Apr 14, 2023

A small enum in lieu of a function pointer is marginally better thanks to branch prediction, as observed on header/count:

test header/count_1 ... bench:          21 ns/iter (+/- 5)
test header/count_2 ... bench:          35 ns/iter (+/- 5)
test header/count_4 ... bench:          66 ns/iter (+/- 2)
test header/count_8 ... bench:         130 ns/iter (+/- 53)
test header/count_16 ... bench:         259 ns/iter (+/- 80)
test header/count_32 ... bench:         499 ns/iter (+/- 43)
test header/count_64 ... bench:         978 ns/iter (+/- 195)
test header/count_128 ... bench:        1938 ns/iter (+/- 116)
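
For readers unfamiliar with the two dispatch styles being compared, the following is a minimal, hypothetical sketch and not the code from this PR. The names `Backend`, `count_scalar`, `count_wide`, `select_fn`, and `count_with` are invented for illustration, and `count_wide` is only a stand-in for a real SIMD routine.

```rust
// Hypothetical sketch of the two dispatch styles; not httparse's actual code.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Backend {
    Scalar,
    Wide,
}

// Plain byte-at-a-time counting.
fn count_scalar(bytes: &[u8], needle: u8) -> usize {
    bytes.iter().filter(|&&b| b == needle).count()
}

// Stand-in for a SIMD routine: walks the input in 8-byte chunks,
// then falls back to the scalar path for the remainder.
fn count_wide(bytes: &[u8], needle: u8) -> usize {
    let mut total = 0;
    let mut chunks = bytes.chunks_exact(8);
    for chunk in &mut chunks {
        total += chunk.iter().filter(|&&b| b == needle).count();
    }
    total + count_scalar(chunks.remainder(), needle)
}

// Style 1: function-pointer dispatch. The caller stores the pointer
// and every use goes through an indirect call.
type CountFn = fn(&[u8], u8) -> usize;

fn select_fn(backend: Backend) -> CountFn {
    match backend {
        Backend::Scalar => count_scalar,
        Backend::Wide => count_wide,
    }
}

// Style 2: enum dispatch. Every use is a `match` with direct calls,
// i.e. a conditional branch rather than an indirect call.
#[inline]
fn count_with(backend: Backend, bytes: &[u8], needle: u8) -> usize {
    match backend {
        Backend::Scalar => count_scalar(bytes, needle),
        Backend::Wide => count_wide(bytes, needle),
    }
}

fn main() {
    let input = b"a:b:c:d:e:f:g:h:i";
    let via_ptr = select_fn(Backend::Wide)(input, b':');
    let via_enum = count_with(Backend::Wide, input, b':');
    assert_eq!(via_ptr, via_enum);
    println!("colons: {}", via_enum);
}
```

Both styles compute the same result; the difference is only in how the call is reached. Whether the enum form is actually faster depends on the target and the compiler, so it is worth measuring as was done above.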

AaronO added a commit to AaronO/httparse that referenced this pull request Apr 15, 2023
First pass, building off seanmonstar#132
@AaronO AaronO mentioned this pull request Apr 15, 2023
Review thread on src/simd/runtime.rs (outdated, resolved)
@AaronO
Contributor Author

AaronO commented Apr 18, 2023

@seanmonstar Squashed to a single commit, "cleanup: simd runtime detection", since this is more of a cleanup than a perf improvement now that we reverted to the atomic (which shouldn't be an issue in absolute terms, but I would rather fine-tune the overhead of runtime feature detection in a separate PR).

@AaronO AaronO changed the title from "perf: SIMD runtime latency" to "cleanup: SIMD runtime detection" Apr 18, 2023
@seanmonstar
Owner

I know when I originally added SIMD support to this crate, the is_x86_feature_detected! macro did not get inlined, so the function call was slower than caching in an atomic locally. Inline attributes were later added, so it could be that the cache is no longer worth keeping. Would be good to measure.
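
As an illustration of what "caching in an atomic locally" can look like, here is a minimal sketch under stated assumptions. It is not the crate's actual runtime module: the constants, the `FEATURE` static, and the `detect` and `cached_feature` functions are hypothetical names, and the choice of `avx2` and `sse4.2` as the probed features is an assumption for the example.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Hypothetical encoding of the detection result; 0 means "not probed yet".
const UNINIT: u8 = 0;
const SCALAR: u8 = 1;
const SSE42: u8 = 2;
const AVX2: u8 = 3;

// Process-wide cache of the detection result.
static FEATURE: AtomicU8 = AtomicU8::new(UNINIT);

// Probe the CPU with the standard library macro (x86 / x86_64 only).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn detect() -> u8 {
    if is_x86_feature_detected!("avx2") {
        AVX2
    } else if is_x86_feature_detected!("sse4.2") {
        SSE42
    } else {
        SCALAR
    }
}

// On other architectures this sketch always reports the scalar fallback.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn detect() -> u8 {
    SCALAR
}

// Fast path: one relaxed load. Slow path: probe once and store the result.
// Racing threads may both probe, but they store the same value.
#[inline]
fn cached_feature() -> u8 {
    let cached = FEATURE.load(Ordering::Relaxed);
    if cached != UNINIT {
        return cached;
    }
    let detected = detect();
    FEATURE.store(detected, Ordering::Relaxed);
    detected
}

fn main() {
    let first = cached_feature();
    let second = cached_feature(); // served from the atomic
    assert_eq!(first, second);
    println!("feature level: {}", first);
}
```

Whether this local cache still beats calling the macro directly is exactly the open question raised here, so the sketch shows the shape of the technique rather than a recommendation.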

@AaronO
Contributor Author

AaronO commented Apr 18, 2023

> I know when I originally added SIMD support to this crate, the is_x86_feature_detected! macro did not get inlined, so the function call was slower than caching in an atomic locally. Inline attributes were later added, so it could be that the cache is no longer worth keeping. Would be good to measure.

I did assembly dumps and it is inlined. It still requires more fine-tuning and analysis, which I think would be best addressed in its own PR.

@seanmonstar seanmonstar merged commit fbb0bdd into seanmonstar:master Apr 20, 2023
31 checks passed
@AaronO AaronO deleted the perf/simd-runtime-latency branch April 20, 2023 20:18
AaronO added a commit to AaronO/httparse that referenced this pull request Apr 20, 2023
First pass at neon support, building off seanmonstar#132
AaronO added a commit to AaronO/httparse that referenced this pull request Apr 25, 2023
First pass at neon support, building off seanmonstar#132
seanmonstar pushed a commit that referenced this pull request Apr 25, 2023
First pass at neon support, building off #132