perf: optimize performance of waybar cpuinfo and gpuinfo modules #952

mislah · 2024-02-24T09:50:57Z

Icon generation bottleneck

The function to generate icons have been causing performance bottleneck since the beginning of cpuinfo module. The original version (3cf5d29, c.sh in the screenshot) has been taking around 30.5ms (not absolute, for comparison).

To reduce the performance overhead, the revision (9b9b1ce, d.sh in the screenshot) have improved the speed to 22.3ms (37% faster than the original version). Here a single jq has been used to replace multiple forks of awk to improve the performance.

The original version turns out to run on 6.58ms (364% faster than original and 238% faster than the revised) after minimizing the number of forks and removing all forks to awk (b.sh in the screenshot). I also tested possibly the minimal version a.sh which contains only if branches completed in 5.4ms.

Despite a.sh is the fastest, I included b.sh in both gpuinfo and cpuinfo as it follows Object Oriented principles and performance difference is insignificant (1.18ms).

The forks

There are a number of redundant forks throughout script like:
maxfreq=$(lscpu | grep "CPU max MHz" | awk -F: '{ print $2}' | sed -e 's/ //g' -e 's/\.[0-9]*//g')

Which could be improved by substituting with:
maxfreq=$(lscpu | awk '/CPU max MHz/ { sub(/\..*/,"",$4); print $4}')

The number of forks have been minimized completely on cpuinfo.sh and partially on gpuinfo.sh.

`top` bottleneck

The overall performance of the script wasn't improved significantly even after minimizing forks as the top -bn1 taking almost 260ms as it has to compute information of all running and sleeping processes. The utility mpstat is a great alternative but it requires the program to be installed on the system. Hence top is replaced by calculating the utilization levels directly from /proc/stat which is taking around 15ms, that is 1633% improvement in performance.

Continuous script

As the script has to recalculate values that doesn't change during runtime (like CPU name and max freq), it is inefficient to run on every 5s. Since waybar allows the scripts to be executed in continuous mode which allows significant improvement on performance as the static values doesn't required to be recalculated on each iteration, the cpuinfo.sh has been adjusted to run as a continuous script.

Other changes

Improved accuracy of utilization level as it now prints average between each updates. Also added option to decide whether to print emojis or glyphs by setting an environment variable NO_EMOJI as the commit: 9b9b1ce converted emojis to glyphs.

Warning to reviewers

The changes included in the commit: a5cfdd0 have been included in this PR, but haven't tested as my machine doesn't have AMD CPU. Although, I tried testing with passing faked sensors data and worked well.

Testing

All the unit testing has been done by the script:

python -m timeit -n 10 -s 'import subprocess' "subprocess.call('./test.sh', shell=True)"

The python timeit is not absolute but is highly precise and useful for comparison. I think strace or time gives more accurate results but are not statistical or precise.

Type of change

Performance improvement (non-breaking change)
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update (non-breaking change; modified files are limited to the documentations)
Technical debt (a code change that does not fix a bug or add a feature but makes something clearer for devs)

Checklist

Optimize cpuinfo to use less resources by minimizing the number of forks as it is called once in every 5 seconds. Revert to the previous icon generation method for faster performance compared to jq after reducing forks.

Revise icon generation method to minimize forks, particularly resource intensive calls to awk as the script runs once in every 5 seconds.

Convert cpuinfo into a continuous script to avoid redundant calculations of constant values. Calculate utilization from /proc/stat as 'top' has been causing the performance bottleneck. Improve accuracy by averaging utilization over a 5 second interval instead of using instantaneous values.

Print glyphs instead of emojis if NO_EMOJI env var is set to 1. Reduce the number of function calls and redeclaration of variables.

Print glyphs instead of emojis if NO_EMOJI env var is set to 1 in gpuinfo module.

kRHYME7 · 2024-02-24T10:38:20Z


❯ ~/.config/hypr/scripts/gpuinfo.sh
/home/khing/.config/hypr/scripts/gpuinfo.sh: line 127: [: -gt: unary operator expected
/home/khing/.config/hypr/scripts/gpuinfo.sh: line 127: [: -gt: unary operator expected
/home/khing/.config/hypr/scripts/gpuinfo.sh: line 127: [: -gt: unary operator expected
/home/khing/.config/hypr/scripts/gpuinfo.sh: line 127: [: -gt: unary operator expected
/home/khing/.config/hypr/scripts/gpuinfo.sh: line 127: [: -gt: unary operator expected
/home/khing/.config/hypr/scripts/gpuinfo.sh: line 127: [: -gt: unary operator expected
{"text":" 61°C", "tooltip":"Intel CometLake-U GT2\n Temperature: 61°C ❄️\n Power Discharge: 15.203 W\n󰾆 Utilization: 27.6 %\n Clock Speed: 1799.66/4900 MHz"}

Disclaimer: I did not fully review it yet. Looks good in the description, so I won't dig deeper if this works on my end. 😁

Additionally, I believe you can pass the $NO_EMOJI variable to the query() function to avoid making major changes to the JSONC. In the query() function, there's a file ${gpuQ} which lists all variables on the script's first execution. You can append the $NO_EMOJI variable to this list, and it might be invoked by running the script once at startup, like so:

${0} --emoji

My sys info if needed.

Hyprland, built from branch main at commit f27054c13e72598e43771cb3f8bfad8ac6c6668f dirty (flake.nix: override inputs for xdph and hyprlang).
Date: Fri Feb 23 23:48:11 2024
Tag: v0.35.0-88-gf27054c1

flags: (if any)


System Information:
System name: Linux
Node name: ArchLinux
Release: 6.7.6-zen1-1-zen
Version: #1 ZEN SMP PREEMPT_DYNAMIC Fri, 23 Feb 2024 16:32:48 +0000


GPU information: 
00:02.0 VGA compatible controller [0300]: Intel Corporation CometLake-U GT2 [UHD Graphics] [8086:9b41] (rev 02) (prog-if 00 [VGA controller])


os-release: NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
ANSI_COLOR="38;2;23;147;209"
HOME_URL="https://archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://gitlab.archlinux.org/groups/archlinux/-/issues"
PRIVACY_POLICY_URL="https://terms.archlinux.org/docs/privacy-policy/"
LOGO=archlinux-logo


plugins:
  hyprgrass by horriblename ver 0.5
  hycov by DreamMaoMao ver 0.3
  hyprwinwrap by Vaxry ver 1.0

mislah · 2024-02-24T10:52:19Z

Oh no, The same happening on my machine too O_O

Demn, I forgot to pass the temperature and utilization value to the function.
Wait fixing..

mislah · 2024-02-24T11:11:19Z

Additionally, I believe you can pass the $NO_EMOJI variable to the query() function to avoid making major changes to the JSONC. In the query() function, there's a file ${gpuQ} which lists all variables on the script's first execution. You can append the $NO_EMOJI variable to this list, and it might be invoked by running the script once at startup

I thought of adding NO_EMOJI as a env var to make it easier for end users to switch between emoji and glyphs as they can easily see those in waybar module templates. I don't think anyone will notice this option if we add the variable in gpuinfo.sh as the script is already scaryyy.

Fix the bug causing the unexpected behavior due to passing wrong arguments to the map_floor function.

kRHYME7 · 2024-02-24T11:22:16Z

I meant like this.
You can add a flag for no_emoji.

As having an ENV is too specific and overkill for just a single script haha

so example command might look like this and can be invoke at startup

exec-once = ~/.config/hypr/scripts/gpuinfo.sh --use intel startup no_emoji

Sorry for making that slightly scary, it's the only way I can think of catering all 3 GPUs.

mislah · 2024-02-24T11:39:05Z

Umm.. I see. It looks great but feels a bit complicated to me . Can you add that to the script if you don't mind or I can try later today.

Sorry for making that slightly scary, it's the only way I can think of catering all 3 GPUs.

No no, don't get me wrong, I was saying shell scripts are scary as a whole.

I fixed the bug you pointed out btw.. please check again.

kRHYME7 · 2024-02-24T11:47:10Z

~~EDITED~~
Is that fine ?
8ff0036

Also we have the --reset so we can override/change the "variables"

Should look like this:

No no, don't get me wrong, I was saying shell scripts are scary as a whole.

Ow, I was afraid this version is not maintainable, but I'm wrong. You guys are Pros. Everything works on my end, BTW. for now

mislah · 2024-02-24T13:23:40Z

Excellent! Thank you, you saved a lot of my time ❤️

Ow, I was afraid this version is not maintainable, but I'm wrong.

It's true that it is a bit hard to understand (I didn't read most of it) but it's amazing how fast it runs. Initially the cpuinfo took around 320ms but your script runs at around 22ms on my machine. Pure genius.

prasanthrangan · 2024-02-25T06:37:47Z

Configs/.config/hypr/scripts/cpuinfo.sh

+prevStat=$(awk '{print $2+$3+$4+$6+$7+$8 }' <<< $statFile)
+prevIdle=$(awk '{print $5 }' <<< $statFile)
+
+while true; do


why the loop? its not recommended to run this as a daemon process.

Continuous script

As the script has to recalculate values that doesn't change during runtime (like CPU name and max freq), it is inefficient to run on every 5s. Since waybar allows the scripts to be executed in continuous mode which allows significant improvement on performance as the static values doesn't required to be recalculated on each iteration, the cpuinfo.sh has been adjusted to run as a continuous script.

It can be made to execute once in every 5s instead by saving the prev statistics to a temp file. However, I don't see why it is not recommended to run it as a daemon as re-executing the script has to deal with unnecessary system calls, read-writes and re-interpreting the script once in every 5s.

Also, it is not going to save cpu time on the scheduler as it has to deal with the sleep syscall from the script otherwise the same from the waybar.

It is also noted that the waybar builtin modules which requires updates on periodic intervals (memory, cpu_usage) are running on separate daemon threads.

Continuous script

As the script has to recalculate values that doesn't change during runtime (like CPU name and max freq), it is inefficient to run on every 5s. Since waybar allows the scripts to be executed in continuous mode which allows significant improvement on performance as the static values doesn't required to be recalculated on each iteration, the cpuinfo.sh has been adjusted to run as a continuous script.

It can be made to execute once in every 5s instead by saving the prev statistics to a temp file. However, I don't see why it is not recommended to run it as a daemon as re-executing the script has to deal with unnecessary system calls, read-writes and re-interpreting the script once in every 5s.

Oh my bad I missed it, you did mention it, but it was too long so I didn't read :(
But still my concern is valid... I do not use the any of these monitoring modules in waybar as i often switch the waybar mode... So my question is that, will this process still run when the waybar mode changes?

As per the confgen.sh architecture, the waybar configuration file config.jsonc is regenerated on switching the waybar mode. So this process won't be summoned by new instances of waybar as the modules will not be present in the config.jsonc.

hyprdots/Configs/.local/share/bin/wbarconfgen.sh

Lines 125 to 130 in 8369bed

# restart waybar

if [ "$reload_flag" == "1" ] ; then

killall waybar

waybar > /dev/null 2>&1 &

fi

According the above snippet, when you switch the mode, the waybar and its entire process tree will be killed, including our poor little daemon.

Sorry for the late reply, was little too busy these days.

kRHYME7 · 2024-03-01T16:20:15Z

If remembered correctly, we can just use ~/.config/hypr/scripts/gpuinfo.sh to get intel and amd cpu temps too. The script uses sensors. ( I only realized recently that sensors provides JSON we can make it more efficient).

to use the gpuinfo script, we can just call it ~/.config/hypr/scripts/gpuinfo.sh --use intel

For now, we can use this. If we want to query CPU, I can help to add static variables on the already availabe metadata file.
These are generated by the gpuinfo script to query first.

This is the reason why the gpuinfo.jsonc have some custom waybar modules available.

For now ~/.config/hypr/scripts/gpuinfo.sh can handle to query for cpu as long as it has an iGPU. ( a single unit like some intel and AMD)

But if there are devices that CPU and GPU are separated, we should have a separate CPU script or we can modify the gpuinfo.

prasanthrangan · 2024-03-02T12:02:20Z

If remembered correctly, we can just use ~/.config/hypr/scripts/gpuinfo.sh to get intel and amd cpu temps too. The script uses sensors. ( I only realized recently that sensors provides JSON we can make it more efficient).

Oh cool idea, why cant we just show gpu and cpu info in the same module? the script output could be both GPU and CPU temps combined.

kRHYME7 · 2024-03-12T00:42:22Z

Oh cool idea, why cant we just show gpu and cpu info in the same module? the script output could be both GPU and CPU temps combined.

We can, but should we consider integrated GPU temp as the CPU temp too?

I'll try to merge cpuinfo and gpuinfo into a single script, so they should share a single metadata. Then use options to it so that we can still get flexibility

[REF] snippet at the bottom we can change --use into --gpu and --cpu

Example syntax With this we can have more options to what to show.

+  $0 --cpu [CPU] --gpu [GPU]   # OUTPUT: "text":" 60°C  55°C" ... 
+  $0 --gpu [GPU]               # OUTPUT: "text":" 55°C " ...
+  $0 --cpu                     # OUTPUT: "text":" 60°C " ...

Of course the tooltip should provide both CPU and GPU temp

❯  ~/.config/hyprdots/scripts/gpuinfo.sh -h

Avalable GPU: intel 
[options]
--toggle         * Toggle available GPU
--use [GPU]      * Only call the specified GPU (Useful for adding specific GPU on waybar)
--reset          *  Remove & restart all query

[flags]
tired            * Adding this option will not query nvidia-smi if gpu is in suspend mode 
startup          * Useful if you want a certain GPU to be set at startup

* If khing declared env = WLR_DRM_DEVICES on hyprland then use this as the primary GPU

Of course, I need help from @mislah to refactor the functions.

ffoebel · 2024-03-16T12:36:24Z

Configs/.config/hypr/scripts/cpuinfo.sh

+util_lv="90:, 60:󰓅, 30:󰾅, 󰾆" 
+
+# Get static CPU information
+model=$(lscpu | awk -F': ' '/Model name/ {gsub(/^ *| *$| CPU.*/,"",$2); print $2}')


lscpu respects LANG and thus requires en_us for this to work.

Great, good find. I'll add this on the next commit.

mislah · 2024-04-15T12:25:37Z

We can, but should we consider integrated GPU temp as the CPU temp too?

I'll try to merge cpuinfo and gpuinfo into a single script, so they should share a single metadata. Then use options to it so that we can still get flexibility
.....
Of course, I need help from @mislah to refactor the functions.

Excellent, I also thought the same, as they are both quite similar. I'm a little too busy these days. I assure you that I'll help as much as I can, but I'm afraid I might not have enough time.

kRHYME7 · 2024-04-15T12:35:30Z

No rush on this one. Everything is working as intended for now. Will try to prioritize other issues.

I will add a PR @ https://github.com/mislah/hyprdots/tree/waybar/optimize soon.

mislah added 5 commits February 23, 2024 22:08

perf: optimize waybar cpuinfo

74d64af

Optimize cpuinfo to use less resources by minimizing the number of forks as it is called once in every 5 seconds. Revert to the previous icon generation method for faster performance compared to jq after reducing forks.

perf: optimize waybar gpuinfo

bc4cca3

Revise icon generation method to minimize forks, particularly resource intensive calls to awk as the script runs once in every 5 seconds.

fix: add option to print glyphs instead of emojis

bd0a4d9

Print glyphs instead of emojis if NO_EMOJI env var is set to 1. Reduce the number of function calls and redeclaration of variables.

fix: add option to print glyphs instead of emojis (gpuinfo)

17d7cd3

Print glyphs instead of emojis if NO_EMOJI env var is set to 1 in gpuinfo module.

fix: bug causing unexpected behavior

24c4796

Fix the bug causing the unexpected behavior due to passing wrong arguments to the map_floor function.

This comment was marked as spam.

Sign in to view

Update gpuinfo.sh

8ff0036

prasanthrangan reviewed Feb 25, 2024

View reviewed changes

mislah requested a review from prasanthrangan February 27, 2024 05:33

ffoebel reviewed Mar 16, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: optimize performance of waybar cpuinfo and gpuinfo modules #952

perf: optimize performance of waybar cpuinfo and gpuinfo modules #952

mislah commented Feb 24, 2024 •

edited

kRHYME7 commented Feb 24, 2024 •

edited

mislah commented Feb 24, 2024

mislah commented Feb 24, 2024 •

edited

kRHYME7 commented Feb 24, 2024 •

edited

mislah commented Feb 24, 2024

kRHYME7 commented Feb 24, 2024 •

edited

This comment was marked as spam.

mislah commented Feb 24, 2024

prasanthrangan Feb 25, 2024

mislah Feb 25, 2024 •

edited

prasanthrangan Mar 2, 2024

mislah Apr 15, 2024

kRHYME7 commented Mar 1, 2024

prasanthrangan commented Mar 2, 2024

kRHYME7 commented Mar 12, 2024 •

edited

ffoebel Mar 16, 2024

mislah Apr 15, 2024

mislah commented Apr 15, 2024

kRHYME7 commented Apr 15, 2024

	# restart waybar

	if [ "$reload_flag" == "1" ] ; then
	killall waybar
	waybar > /dev/null 2>&1 &
	fi

perf: optimize performance of waybar cpuinfo and gpuinfo modules #952

Are you sure you want to change the base?

perf: optimize performance of waybar cpuinfo and gpuinfo modules #952

Conversation

mislah commented Feb 24, 2024 • edited

Icon generation bottleneck

The forks

top bottleneck

Continuous script

Other changes

Warning to reviewers

Testing

Type of change

Checklist

kRHYME7 commented Feb 24, 2024 • edited

My sys info if needed.

mislah commented Feb 24, 2024

mislah commented Feb 24, 2024 • edited

kRHYME7 commented Feb 24, 2024 • edited

mislah commented Feb 24, 2024

kRHYME7 commented Feb 24, 2024 • edited

This comment was marked as spam.

mislah commented Feb 24, 2024

prasanthrangan Feb 25, 2024

Choose a reason for hiding this comment

mislah Feb 25, 2024 • edited

Choose a reason for hiding this comment

Continuous script

prasanthrangan Mar 2, 2024

Choose a reason for hiding this comment

Continuous script

mislah Apr 15, 2024

Choose a reason for hiding this comment

kRHYME7 commented Mar 1, 2024

prasanthrangan commented Mar 2, 2024

kRHYME7 commented Mar 12, 2024 • edited

ffoebel Mar 16, 2024

Choose a reason for hiding this comment

mislah Apr 15, 2024

Choose a reason for hiding this comment

mislah commented Apr 15, 2024

kRHYME7 commented Apr 15, 2024

mislah commented Feb 24, 2024 •

edited

`top` bottleneck

kRHYME7 commented Feb 24, 2024 •

edited

mislah commented Feb 24, 2024 •

edited

kRHYME7 commented Feb 24, 2024 •

edited

kRHYME7 commented Feb 24, 2024 •

edited

mislah Feb 25, 2024 •

edited

kRHYME7 commented Mar 12, 2024 •

edited