Output the progress once more when the whole scan process finished #828

WangYihang · 2024-03-16T02:31:53Z

Introduction

As the issue (#823) montioned, the status output lacks the final line, which is crucial for certain monitoring systems that track scanning progress via status files. As a result, these systems are currently unable to accurately capture the completion of the scanning process.

For comprehensive background information, refer to issue #823.

Overview of Current Issue

Prior to the proposed changes, the status output format is as follows:

real-time,time-elapsed,time-remaining,percent-complete,hit-rate,active-send-threads,sent-total,sent-last-one-sec,sent-avg-per-sec,recv-success-total,recv-success-last-one-sec,recv-success-avg-per-sec,recv-total,recv-total-last-one-sec,recv-total-avg-per-sec,pcap-drop-total,drop-last-one-sec,drop-avg-per-sec,sendto-fail-total,sendto-fail-last-one-sec,sendto-fail-avg-per-sec
2024-03-16 10:24:50,0,11,0.249121,0.000000,4,2,1,68,0,0,0,0,0,0,0,0,0,0,0,0
...
2024-03-16 10:24:58,8,0,99.590841,19.531250,0,256,1,4009,50,0,6,200,50,25,0,0,0,0,0,0

Proposed Enhancement

With the proposed modifications, the status output format will be updated to include the final line, thereby providing a complete overview of the scanning progress:

real-time,time-elapsed,time-remaining,percent-complete,hit-rate,active-send-threads,sent-total,sent-last-one-sec,sent-avg-per-sec,recv-success-total,recv-success-last-one-sec,recv-success-avg-per-sec,recv-total,recv-total-last-one-sec,recv-total-avg-per-sec,pcap-drop-total,drop-last-one-sec,drop-avg-per-sec,sendto-fail-total,sendto-fail-last-one-sec,sendto-fail-avg-per-sec
2024-03-16 10:24:50,0,11,0.249121,0.000000,4,2,1,68,0,0,0,0,0,0,0,0,0,0,0,0
...
2024-03-16 10:24:58,8,0,99.590841,19.531250,0,256,1,4009,50,0,6,200,50,25,0,0,0,0,0,0

…#823)

droe · 2024-03-16T09:12:54Z

src/monitor.c

@@ -487,6 +487,9 @@ void monitor_run(iterator_t *it, pthread_mutex_t *lock)
 		if (status_fd) {
 			update_status_updates_file(export_status, status_fd);
 		}
+		if (zsend.complete && zrecv.complete) {


If the goal is to guarantee a final status line based on stats in completed state, then there's a race here. The scan can be still incomplete when we obtain stats on line 473 and subsequently produce status lines based on those stats, and then complete on the recv thread just before monitor thread execution reaches line 490, in which case the loop is exited without a final status line.

Thank you for your feedback. It appears that I may not have fully grasped your point.

Currently, ZMap does not correctly update the status file after completing the scan (see #823 ).

My intention is to ensure that, once ZMap's scan is finished, it will refetch the current scanning status one more time and output the last line of the status file. I am not quite familiar with the code base, so my changes maybe naive somehow.

To address this issue, do you have any further suggestions for improvement? I am more than willing to continue assisting in resolving this issue.

Hey, @zakird and @phillip-stephens, any suggestions on these tweaks?

Thank you guys once again!

zsend.complete and zrecv.complete are global vars that can be set at any time on other threads running concurrently to the monitor thread. Their values can change basically at any point in time. There is a lock taken across updating of the stats, but the lock does not protect zrecv.complete. I'm suggesting that this change makes it much more likely that the last status line has completed stats, by moving the sleep to after the condition check, but that it still does not guarantee it.

One way to avoid the race without reworking any of the locking would be to read zsend.complete && zrecv.complete into a local var at the beginning of the loop, and then breaking out of the loop based on that local var. Then we only break out of the loop when we know that all send and recv threads had completed already before updating stats and writing the status lines.

Thank you for your feedback. I have a few questions to better understand your comments.

You mentioned that zsend.complete and zrecv.complete are global vars that can be set at any time on other threads running concurrently to the monitor thread. This makes me wonder if the original zmap code src/monitor.c#L472 might have a race condition issue similar to the one you described.

Intuitively the variable zsend.complete and zrecv.complete, they should be monotonic, which means they change only from an incomplete to a complete state, and never revert from complete to incomplete. With this in mind, I believe the current modifications should be safe, as once both variables reach the complete state, they should not be altered by other threads.

About your comment "by moving the sleep to after the condition check, but that it still does not guarantee it," I havn't change the callsite of the sleep function, it remains after the condition check in my latest revision. Could you please specify what additional steps you recommend to ensure the effectiveness of this change?

Correct, zsend.complete and zrecv.complete are monotonic in the sense that they only go from 0 to 1. Correct, the issue existed before your change, your change does improve the situation, vastly, but seems like an incomplete fix in that it works as expected most of the time, but it does not guarantee that the last status line written was based on counters in their final state after completion.

The corner case I'm (only a little) concerned about:

recv thread is still executing, zrecv.complete is 0.

monitor thread executes (new) lines 474 to 490, outputting a line to the status file based on non-completed status counters.

monitor thread gets preempted, or, say, file I/O takes unexpectedly long.

recv thread completes and sets zrecv.complete to 1.

monitor thread executes line 491, observes zrecv.complete being 1, breaks the loop. The last status line is still based on older counters before completion.

In practice, especially with default cooldown, not a big deal. Still, I would propose to read zsend.complete && zrecv.complete into a local var at the beginning of the loop, and break based on the local var on line 491, knowing that then, when we are complete, we were already complete before outputting status, which means status was based on counters after completion. If completion happens during production of the status line (during execution of lines 474 to 490), we take the loop once more, produce another full status line, then exit after a complete line.

Understood, I will make an effort to implement improvements based on your suggestions.

…ap#823, zmap#828)

WangYihang · 2024-03-18T01:15:54Z

src/monitor.c

-	export_status_t *export_status = xmalloc(sizeof(export_status_t));
-
-	while (!(zsend.complete && zrecv.complete)) {
+void export_then_update(int_status_t *internal_status, iterator_t *it, export_status_t *export_status, pthread_mutex_t *lock) {


I have now encapsulated the main functionality within the while loop (which involves obtaining the current scanning status and writing it to a status file) into a separate function called export_then_update. This encapsulation simplifies the logic in monitor_run.

The structure of the while loop remains unchanged. The only addition is to call export_then_update once more after the while loop concludes. This ensures that the status file is updated based on the latest state.

If there are any concerns regarding the style or functionality of the code modifications I've made, please let me know. Your feedback is greatly appreciated! @droe

This does address the race I pointed out. No other concerns, I'm not a reviewer tho.

phillip-stephens

This looks reasonable to me, really appreciate the contribution @WangYihang and the PR discussion @droe!

WangYihang added 2 commits March 16, 2024 10:18

fix: output the missing completed state into status output file (zmap…

7d0e23c

…#823)

fix: add missing header file stdbool.h (zmap#823)

332ab7a

droe reviewed Mar 16, 2024

View reviewed changes

Refactor monitor_run function and add export_then_update function （zm…

98050dd

…ap#823, zmap#828)

WangYihang commented Mar 18, 2024

View reviewed changes

phillip-stephens self-requested a review March 18, 2024 22:22

phillip-stephens approved these changes Mar 18, 2024

View reviewed changes

phillip-stephens merged commit 4d4166e into zmap:main Mar 18, 2024
11 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Output the progress once more when the whole scan process finished #828

Output the progress once more when the whole scan process finished #828

WangYihang commented Mar 16, 2024

droe Mar 16, 2024 •

edited

WangYihang Mar 16, 2024 •

edited

droe Mar 16, 2024

WangYihang Mar 16, 2024 •

edited

droe Mar 16, 2024

WangYihang Mar 18, 2024

WangYihang Mar 18, 2024

droe Mar 18, 2024

phillip-stephens left a comment

Output the progress once more when the whole scan process finished #828

Output the progress once more when the whole scan process finished #828

Conversation

WangYihang commented Mar 16, 2024

Introduction

Overview of Current Issue

Proposed Enhancement

droe Mar 16, 2024 • edited

Choose a reason for hiding this comment

WangYihang Mar 16, 2024 • edited

Choose a reason for hiding this comment

droe Mar 16, 2024

Choose a reason for hiding this comment

WangYihang Mar 16, 2024 • edited

Choose a reason for hiding this comment

droe Mar 16, 2024

Choose a reason for hiding this comment

WangYihang Mar 18, 2024

Choose a reason for hiding this comment

WangYihang Mar 18, 2024

Choose a reason for hiding this comment

droe Mar 18, 2024

Choose a reason for hiding this comment

phillip-stephens left a comment

Choose a reason for hiding this comment

droe Mar 16, 2024 •

edited

WangYihang Mar 16, 2024 •

edited

WangYihang Mar 16, 2024 •

edited