Speed up the branchless UTF-8 decoder by removing !len #7

Open

danielthegray

In your post, you say: "Adding that !len is actually somewhat costly,
though I couldn’t figure out why."

My suspicion is that the "!" operator essentially behaves like a branch:
it returns 1 if the input is 0, and 0 otherwise.

So my idea was to copy your table of lengths and add a second table of
"error lengths" that gives the same effect: 0 when the lead byte is OK and
1 when there is an error, so that the decoder always moves forward by at
least one byte, as mentioned.

The throughput went up from 504 MB/s to 557 MB/s on my machine.
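
A minimal sketch of the idea, assuming the 32-entry length table indexed by the top five bits of the first byte, as described in the post. The names `error_lengths`, `step_with_not` and `step_with_table` are hypothetical and chosen for illustration; they are not necessarily what the commit uses.

```c
#include <stddef.h>

/* Sequence length by the top 5 bits of the first byte.
 * 0 marks an invalid lead byte (a continuation byte, or 0xF8..0xFF). */
static const unsigned char lengths[32] = {
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 3, 3, 4, 0
};

/* Hypothetical companion table: 1 exactly where lengths[] is 0. */
static const unsigned char error_lengths[32] = {
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1
};

/* Bytes to advance, computed with !len as in the post. */
static size_t step_with_not(unsigned char first)
{
    int len = lengths[first >> 3];
    return (size_t)(len + !len);
}

/* Bytes to advance, with !len replaced by a second table lookup. */
static size_t step_with_table(unsigned char first)
{
    unsigned idx = first >> 3;
    return (size_t)lengths[idx] + error_lengths[idx];
}
```

Both functions return the same step for every possible first byte, so the change only trades the `!len` computation for an extra table load. Whether that trade is a win probably depends on the compiler and CPU.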

N-R-K commented Jun 23, 2022

For what it's worth, I actually see the speed drop from ~647 MB/s to ~611 MB/s with this patch applied on my system (Ryzen 7 3700X).
