Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify replace_bytes in line_buffer #2729

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

riquito
Copy link

@riquito riquito commented Feb 4, 2024

This PR removes an unnecessary loop from replace_bytes in line_buffer.

I also added tests for some cases not covered (empty string, same byte replaced, nothing found in a non-empty string).

Unless it's an optimization that I fail to grasp, it should be an oversight from last refactoring
f7ff34f

@BurntSushi
Copy link
Owner

f7ff34f did not introduce the loop though.

I was curious, so I went back through the history, and this inner loop has been present since it was written initially in 2016:

ripgrep/src/search.rs

Lines 555 to 569 in 812cdb1

/// Replaces a with b in buf.
fn replace_buf(buf: &mut [u8], a: u8, b: u8) {
if a == b {
return;
}
let mut pos = 0;
while let Some(i) = memchr(a, &buf[pos..]).map(|i| pos + i) {
buf[i] = b;
pos = i + 1;
while buf.get(pos) == Some(&a) {
buf[pos] = b;
pos += 1;
}
}
}

To be clear, it wasn't an oversight. That inner loop is indeed an optimization. It works well because binary data tends to have long runs of NUL terminators, and it is slower to stop and start memchr for every byte in a sequence. Its latency is worse than just comparing one-byte-at-a-time.

The extra tests are nice. And you're right that from a correctness perspective, the extra loop is superfluous. So that means a comment saying why it's there would be great to add. Would you like to do that?

@riquito
Copy link
Author

riquito commented Feb 4, 2024

Thanks, I learned something

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants