perf(ignore): don't search subdirs for git/ignore files if max depth is reached #2565
Comments
To be clear, this optimization only affects users who set |
The implementation should already do this: ripgrep/crates/ignore/src/walk.rs Lines 1493 to 1495 in 304a60e
If it's not, I'm not sure why. I personally won't have a chance to investigate this probably until I give |
That path seems to be only reached for |
Yes, that's because the single threaded case uses walkdir's |
The path that my example code take is this ripgrep/crates/ignore/src/walk.rs Line 1022 in 304a60e
|
Like I said, I haven't looked into this. I only know that |
@sigmaSd I've tried reproducing your steps but didn't hit the issue 🤔 Here's my last output while following your guide
|
@AgustinRamiroDiaz I'm not sure what steps you are following; you seem to be spawning rg for some reason
@sigmaSd yeah, I forgot to clarify that I couldn't follow these steps
because I wasn't getting a
and then ran
|
As shown above, you have to create a directory with nested subdirectories; then you can strace the compiled program against the root dir
When searching for git/ignore files (the ones enabled with .ignore, .git_ignore and .git_exclude), the crate should not search the subdirectories if it's at the max depth
example:
a.rs
you can see that we make a lot of unneeded syscalls, because we're reading git/ignore files even though we won't search inside those directories
this is the best case (empty dirs); in practice it's worse, because directories can contain those files, and this crate will then have to issue read syscalls for them
without that search, it would be only 3 syscalls per directory
suggestion:
I'm suggesting that we not search for those files inside subdirs once we reach the max depth
Unless I'm missing something, this should keep the exact same functionality while giving the full performance boost
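To make the suggestion concrete, here is a minimal std-only sketch of the idea. It is not the ignore crate's code: `walk`, `visit`, `Stats`, the `skip_at_max_depth` flag and the list of probed file names are all hypothetical, and an existence check stands in for the real open/read of an ignore file. The root is treated as depth 0.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Ignore-file locations probed per directory (illustrative list, an assumption).
const IGNORE_FILES: [&str; 3] = [".ignore", ".gitignore", ".git/info/exclude"];

/// Counters that make the effect of the depth check observable.
#[derive(Debug, Default)]
pub struct Stats {
    /// Number of ignore-file probes performed.
    pub ignore_probes: usize,
    /// Entries yielded by the walk (the root itself is not included).
    pub entries: Vec<PathBuf>,
}

/// Hypothetical walker; `skip_at_max_depth` toggles the proposed optimization.
pub fn walk(root: &Path, max_depth: usize, skip_at_max_depth: bool) -> io::Result<Stats> {
    let mut stats = Stats::default();
    visit(root, 0, max_depth, skip_at_max_depth, &mut stats)?;
    Ok(stats)
}

fn visit(
    dir: &Path,
    depth: usize,
    max_depth: usize,
    skip_at_max_depth: bool,
    stats: &mut Stats,
) -> io::Result<()> {
    // A directory at max depth is yielded by its parent but never read,
    // so its ignore files cannot change what the walk returns.
    let will_descend = depth < max_depth;

    if will_descend || !skip_at_max_depth {
        for name in IGNORE_FILES.iter() {
            stats.ignore_probes += 1;
            // Stand-in for opening and parsing the ignore file.
            let _ = dir.join(name).exists();
        }
    }

    if !will_descend {
        return Ok(());
    }

    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        let is_dir = entry.file_type()?.is_dir();
        stats.entries.push(path.clone());
        if is_dir {
            visit(&path, depth + 1, max_depth, skip_at_max_depth, stats)?;
        }
    }
    Ok(())
}
```

Calling `walk(root, 1, false)` and `walk(root, 1, true)` on the same tree yields the same entries in this sketch; only `ignore_probes` differs, because with the flag set the directories at the max depth are no longer probed.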
motivation:
you can check out helix-editor/helix#7715, where disabling this behavior reduced the walk from ~4000 syscalls to ~900 and from 20 seconds to 100 ms on my old PC with an HDD
so ~3000 wasted syscalls for ~300 files