New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Scan dir once #658
base: master
Are you sure you want to change the base?
Scan dir once #658
Conversation
So you you have both /var and /var/log monitored, the /var scan won't decend into /var/log. /var/log is only scanned once, with the right options. For example, this would only scan /etc/sysconfig with report changes. <directories check_all="yes">/etc</directories> <directories check_all="yes" report_changes="yes">/etc/sysconfig</directories> For example, this would avoid summing the log files. <directories check_all="yes">/var</directories> <directories check_owner="yes" check_perm="yes" check_sum="no">/var/log</directories>
It seems that this changes the default behavior. Wouldn't it be best to add it as a configurable option? |
That makes sense. Do you think it should be a compile time option, or a check_something option to the directories tag, or both? |
This reverts commit c14f21a.
This reverts commit d0c1a4b.
This reverts commit 9039302.
This reverts commit ad8514b.
This reverts commit b7e9ebe.
…a side effect" This reverts commit e8ed521.
…also target directories. So if you have both /var and /var/log, and apply this option to /var, then /var/log will only be scanned once. This make a huge disk I/O difference if it gets into your logs. It also lets you apply different options to a subdirectory, ie check_sum="no".
I would expect something like |
Since the current behavior of defining directories multiple times produces undesirable results, I don't know if this really needs a configuration option. I haven't tested this pull, but if it fixes the interaction, I'm fine with it changing a currently broken operation. |
Perhaps I am misunderstanding this. If I currently have /etc defined, then I'll receive alerts if /etc/foo/bar changes. With this patch, wouldn't I have to manually enumerate all current and future subdirectories and define them individually? Wouldn't I have to maintain the configuration file manually as people add directories? |
The way I understand it (again, I haven't looked at it much) is that you can define /etc to check_all, but /etc/important_file to change ownership. The size and md5 of important_file wouldn't cause alerts, but a change in owner would. Everything else should be monitored as before. |
My intent is very close to what Dan described, except it just applies to subdirectories, not files (that might make sense as future work). syscheck currently itterates through the list of directories. On each one, it does a recursive scan of all subdirectories and files, similar to unix find(1). If you have both /etc and /etc/important_dir defined in the config, the itteration loop will do a find in each. So things in /etc/important_dir get scanned twice, and entered into the database twice. This change is supposed to let it realize that it has a listed starting directory of /etc/important_dir, and skip that while scanning /etc. Then /etc/important_dir will be scanned later when the itteration loop gets to it. All of it's sibling directories under /etc would be scanned as normal. This lets you have different check options on /etc/important_dir, ie report_changes. It also lets you scan all of /var, but avoid checksumming the big log files. That was my main reason for writing this, syscheck was eating most of my disk I/O bandwidth, repeatedly sha1 and md5 summing my tomcat logs. I've added a skip_subdir="yes" option to the directories tag. That lets people decide if they want this or not. This does use more cpu to do the directory name comparisons. |
I wasn't clear about the cpu usage. This change uses more cpu to compare each subdirectory found with the list of top level starting directories. So having it optional can reduce the cpu used by syscheck. This works by having the recursive filesystem crawl strcmp() every directory if finds with the list of target directories. That could be made more efficent, at a cost in complexity. Maybe I should do the subdir checks between the top level starting directories at the beginning and keep a list of subdirs of each dir? Then while crawling /usr, I wouldn't be tring to strcmp /usr/local, /usr/share, ... with /etc, /etc/important_dir, /usr ... |
This sounds good. |
I think ignore in syscheck was fixed somewhere between 2.7.1 and 2.8.2. In 2.8.2, ignore seems to apply to syscheck scanning, not just reporting. |
@doke2 That depends where you apply it. On the manager-side, it is reporting only. On the agent side, it is ignored on the scanning time. |
Don't decend into monitored subdirectories of monitored directories.
If you you have both /var and /var/log monitored, the /var scan won't
decend into /var/log. /var/log is only scanned once, with the
right options.
For example, this would only scan /etc/sysconfig with report changes.
<directories check_all="yes">/etc</directories>
<directories check_all="yes" report_changes="yes">/etc/sysconfig</directories>
For example, this would avoid summing the log files.
<directories check_all="yes">/var</directories>
<directories check_owner="yes" check_perm="yes" check_sum="no">/var/log </directories>