stop_vmstat doesn't stop vmstat #2

Open
davemq opened this issue Apr 25, 2024 · 1 comment

davemq commented Apr 25, 2024

I ran

./lpcpu.sh duration=30

on an IBM Power10 system running RHEL 8.9. The tool appears to hang after printing the finishing time:

Running Linux Performance Customer Profiler Utility version 356c8306d2c85f6af89dc2b85c151f4bbd9e9c63 2018-07-25 16:45:06 -0500                                                               
Importing CLI variable : duration=30

Starting Time: Thu Apr 25 14:53:51 EDT 2024
Setting up sar.
Setting up iostat.
Setting up mpstat.
Setting up vmstat.
Setting up lparstat.
Setting up top.
Setting up meminfo.
Setting up proc-interrupts.
Setting up cpupower.
Profilers start at: Thu Apr 25 14:53:52 EDT 2024
Starting sar.default [5]
Starting iostat.default [5] [mode=disks]
Starting mpstat.default [5]
Starting vmstat.default [5]
Starting lparstat.default [5]
Starting top.
Starting meminfo.default [5]
Starting proc-interrupts.default [5]
Starting cpupower.
Waiting for 30 seconds.
Stopping sar.
Stopping iostat.
Stopping mpstat.
Stopping vmstat.
Stopping lparstat.
Stopping top.
Stopping meminfo.
Stopping interrupts.
Stopping cpupower.
Profilers stop at: Thu Apr 25 14:54:22 EDT 2024
Processing sar data.
Processing iostat data.
Processing mpstat data.
Processing vmstat data.
Processing lparstat data.
Processing top data.
Processing meminfo data.
Processing interrupts data.
/home/davemarq/lpcpu/./lpcpu.sh: line 1233: report_cpupower: command not found
Setting up postprocess.sh
/home/davemarq/lpcpu/./lpcpu.sh: line 1257: setup_postprocess_lparstat: command not found
/home/davemarq/lpcpu/./lpcpu.sh: line 1257: setup_postprocess_cpupower: command not found
Gathering system information
Finishing time: Thu Apr 25 14:56:41 EDT 2024                                                   

I used ps to see what was going on:

    PID TTY      STAT   TIME COMMAND                                                          
   8894 pts/2    Ss     0:00 -bash
  20988 pts/2    S      0:00  \_ sudo -s                                                      
  20997 pts/2    S      0:00      \_ /bin/bash
 143942 pts/2    S+     0:00          \_ /bin/bash /home/davemarq/lpcpu/./lpcpu.sh duration=30
 143994 pts/2    S+     0:00              \_ tee -i /tmp/lpcpu_data.p181n201.default.2024-04-2
 144208 pts/2    S+     0:00 vmstat 5

The leftover vmstat is odd, since the log above says vmstat was stopped. The tee command is at line 1452 in lpcpu.sh:

} 2>&1 | tee -i ${LOGDIR}/lpcpu.out

My suspicion is that vmstat is holding the pipe to tee open. So why didn't vmstat stop? Here's stop_vmstat():
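The suspected mechanism can be reproduced without lpcpu. This is a minimal sketch, not taken from lpcpu.sh: sleep 3 stands in for the leftover vmstat and cat stands in for tee -i. The brace group finishes immediately, but because of 2>&1 the background child still holds the write end of the pipe as its stderr, so the reader only sees EOF once that child exits.

```shell
#!/usr/bin/env bash
# Stand-in demo: 'sleep 3' plays the leftover vmstat, 'cat' plays 'tee -i'.
start=$SECONDS
{
    # stdout is redirected away, but stderr still points at the pipe (2>&1 below),
    # so this child keeps the pipe open after the group itself is done.
    sleep 3 > /dev/null &
    echo "group finished"
} 2>&1 | cat > /dev/null
elapsed=$(( SECONDS - start ))
echo "pipeline returned after about ${elapsed}s"
```

The pipeline returns only after the background sleep ends, which matches the hang seen after "Finishing time".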

function stop_vmstat() {
	echo "Stopping vmstat."
	kill $VMSTAT_PID
}

VMSTAT_PID is set by start_vmstat():

function start_vmstat() {
	echo "Starting vmstat."$id" ["$interval"]" | tee -a $LOGDIR/profile-log.$RUN_NUMBER
	vmstat $interval | ${LPCPUDIR}/tools/output-timestamp.pl > $LOGDIR/vmstat.$id.$RUN_NUMBER &
	VMSTAT_PID=$!
	disown $VMSTAT_PID
}

If you read about $! in the bash manual page, it says

   !      Expands  to  the process ID of the job most recently placed into
         the background, whether executed as an asynchronous  command  or
         using the bg builtin (see JOB CONTROL below).

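But which PID is that, the PID of vmstat or of output-timestamp.pl? This excerpt doesn't say, but for a background pipeline bash sets $! to the PID of the last process in the pipeline, which here is output-timestamp.pl. So kill $VMSTAT_PID never signals vmstat itself. A more reliable way of stopping the vmstat pipeline is needed.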
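One way to check which process $! names for a background pipeline is to ask ps. This is a small sketch with stand-ins, not code from lpcpu.sh: sleep 3 plays the role of vmstat and cat plays the role of output-timestamp.pl. In bash this should report cat, the last command in the pipeline.

```shell
#!/usr/bin/env bash
# Stand-ins: 'sleep 3' for 'vmstat $interval', 'cat' for output-timestamp.pl.
sleep 3 | cat > /dev/null &
bg_pid=$!

sleep 0.5                                  # give both children time to exec
last_name=$(ps -o comm= -p "$bg_pid")      # command name of the process $! refers to
echo "\$! refers to: $last_name"

kill "$bg_pid"                             # signals only that one process
wait                                       # the producer keeps running until it exits by itself
```

Killing $bg_pid here leaves the producer side of the pipeline running until it ends on its own, which is the same situation as the stray vmstat 5 in the ps output.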

I fixed this temporarily by changing stop_vmstat() to use pkill vmstat. However, another way is to use a "jobspec" to specify the job to be killed. In this case, you could set VMSTAT_PID="%vmstat". Also change disown $VMSTAT_PID to disown $! and move it before setting VMSTAT_PID.

davemq commented Apr 25, 2024

My idea of using a jobspec didn't work when I tried it. Going back to using pkill.
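A possible alternative to pkill, which matches every vmstat on the system, is to keep vmstat as the backgrounded command and move the filter into a process substitution, so that $! is the sampler's own PID. The following is only a sketch under that assumption and has not been tried in lpcpu.sh: start_sampler, stop_sampler and SAMPLER_PID are hypothetical names, and sleep 30 and cat stand in for vmstat $interval and output-timestamp.pl.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and arguments are illustrative, not from lpcpu.sh.

# start_sampler OUTFILE FILTER CMD [ARGS...]
start_sampler() {
    local outfile=$1 filter=$2
    shift 2
    # The sampler is the backgrounded command; the filter runs inside a
    # process substitution, so $! is the sampler's PID, not the filter's.
    "$@" > >("$filter" > "$outfile") &
    SAMPLER_PID=$!
}

stop_sampler() {
    kill "$SAMPLER_PID"
    # Reap the sampler; its signal exit status is deliberately ignored.
    wait "$SAMPLER_PID" 2>/dev/null || true
}

# Stand-ins: 'sleep 30' for 'vmstat $interval', 'cat' for output-timestamp.pl.
outfile=$(mktemp)
start_sampler "$outfile" cat sleep 30
sleep 0.5                                   # let the child exec
sampler_name=$(ps -o comm= -p "$SAMPLER_PID")
echo "\$! refers to: $sampler_name"
stop_sampler
rm -f "$outfile"
```

Once the sampler is killed, the filter sees EOF on its stdin and exits by itself, so nothing is left holding the pipe to tee. This sketch leaves out the disown call from start_vmstat() because stop_sampler uses wait on the same PID.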
