Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output the progress once more when the whole scan process finished #828

Merged
merged 3 commits into from Mar 18, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
5 changes: 4 additions & 1 deletion src/monitor.c
Expand Up @@ -469,7 +469,7 @@ void monitor_run(iterator_t *it, pthread_mutex_t *lock)
int_status_t *internal_status = xmalloc(sizeof(int_status_t));
export_status_t *export_status = xmalloc(sizeof(export_status_t));

while (!(zsend.complete && zrecv.complete)) {
while (1) {
update_pcap_stats(lock);
export_stats(internal_status, export_status, it);
log_drop_warnings(export_status);
Expand All @@ -487,6 +487,9 @@ void monitor_run(iterator_t *it, pthread_mutex_t *lock)
if (status_fd) {
update_status_updates_file(export_status, status_fd);
}
if (zsend.complete && zrecv.complete) {
Copy link
Contributor

@droe droe Mar 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the goal is to guarantee a final status line based on stats in completed state, then there's a race here. The scan can be still incomplete when we obtain stats on line 473 and subsequently produce status lines based on those stats, and then complete on the recv thread just before monitor thread execution reaches line 490, in which case the loop is exited without a final status line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your feedback. It appears that I may not have fully grasped your point.

Currently, ZMap does not correctly update the status file after completing the scan (see #823 ).

My intention is to ensure that, once ZMap's scan is finished, it will refetch the current scanning status one more time and output the last line of the status file. I am not quite familiar with the code base, so my changes maybe naive somehow.

To address this issue, do you have any further suggestions for improvement? I am more than willing to continue assisting in resolving this issue.

Hey, @zakird and @phillip-stephens, any suggestions on these tweaks?

Thank you guys once again!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zsend.complete and zrecv.complete are global vars that can be set at any time on other threads running concurrently to the monitor thread. Their values can change basically at any point in time. There is a lock taken across updating of the stats, but the lock does not protect zrecv.complete. I'm suggesting that this change makes it much more likely that the last status line has completed stats, by moving the sleep to after the condition check, but that it still does not guarantee it.

One way to avoid the race without reworking any of the locking would be to read zsend.complete && zrecv.complete into a local var at the beginning of the loop, and then breaking out of the loop based on that local var. Then we only break out of the loop when we know that all send and recv threads had completed already before updating stats and writing the status lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your feedback. I have a few questions to better understand your comments.

  1. You mentioned that zsend.complete and zrecv.complete are global vars that can be set at any time on other threads running concurrently to the monitor thread. This makes me wonder if the original zmap code src/monitor.c#L472 might have a race condition issue similar to the one you described.

  2. Intuitively the variable zsend.complete and zrecv.complete, they should be monotonic, which means they change only from an incomplete to a complete state, and never revert from complete to incomplete. With this in mind, I believe the current modifications should be safe, as once both variables reach the complete state, they should not be altered by other threads.

  3. About your comment "by moving the sleep to after the condition check, but that it still does not guarantee it," I havn't change the callsite of the sleep function, it remains after the condition check in my latest revision. Could you please specify what additional steps you recommend to ensure the effectiveness of this change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, zsend.complete and zrecv.complete are monotonic in the sense that they only go from 0 to 1. Correct, the issue existed before your change, your change does improve the situation, vastly, but seems like an incomplete fix in that it works as expected most of the time, but it does not guarantee that the last status line written was based on counters in their final state after completion.

The corner case I'm (only a little) concerned about:

  1. recv thread is still executing, zrecv.complete is 0.
  2. monitor thread executes (new) lines 474 to 490, outputting a line to the status file based on non-completed status counters.
  3. monitor thread gets preempted, or, say, file I/O takes unexpectedly long.
  4. recv thread completes and sets zrecv.complete to 1.
  5. monitor thread executes line 491, observes zrecv.complete being 1, breaks the loop. The last status line is still based on older counters before completion.

In practice, especially with default cooldown, not a big deal. Still, I would propose to read zsend.complete && zrecv.complete into a local var at the beginning of the loop, and break based on the local var on line 491, knowing that then, when we are complete, we were already complete before outputting status, which means status was based on counters after completion. If completion happens during production of the status line (during execution of lines 474 to 490), we take the loop once more, produce another full status line, then exit after a complete line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, I will make an effort to implement improvements based on your suggestions.

break;
}
sleep(UPDATE_INTERVAL);
}
if (!zconf.quiet) {
Expand Down