perf: speed improvements #1441

stephanGarland · 2023-01-20T04:58:30Z

Summary

This is a draft for now. It pulls in the patch I linked here, as well as some other work. I mostly wanted to get people's opinion on it early before I continue with the entire project.

In general, it speeds up performance by reducing fork calls, either by combining various combinations of sed | grep | cut etc. into a single awk call, or where possible, using shell built-ins like parameter substitution and modification.

Fixes

It addresses but does not fix issue 290

Other Information

I haven't ran the full test suite yet, but I know at least one (where) fails; I believe this to be a false positive, and will address it.

…ands

…re possible

lib/utils.bash

stephanGarland · 2023-01-20T05:03:33Z

lib/utils.bash

@@ -190,7 +229,8 @@ find_versions() {
  version=$(get_version_from_env "$plugin_name")
  if [ -n "$version" ]; then
    local upcase_name
-    upcase_name=$(printf "%s\n" "$plugin_name" | tr '[:lower:]-' '[:upper:]_')
+    tr_args=("$plugin_name" "[:lower:]" "[:upper:]" "-" "_")
+    upcase_name=$(fast_tr ${tr_args[@]})


Due to the subshell this is at best equally as fast as before. Probably a way to do this without a subshell.

lib/utils.bash

jthegedus · 2023-01-20T06:07:06Z

@stephanGarland A few thoughts after a quick glance.

Before we introduce changes like these I would like to standardize and have in our CI pipelines performance testing and reporting to PRs. That way we identify the actual performance uplift of a specific change. (https://github.com/sharkdp/hyperfine is my current candidate for this automation. How to integrate it is something I will look at very soon).

Once we have the above, we can then go through and add each of these changes one at a time. That way we should have clear data about the improvement for each case.

Further, #1433 & #1138 are partially addressing #1396 & #1397 which will lead us down the path towards using more native shell features. I am not sure if changes for those will clash with what you have here, but be prepared for some rework here.

This draft PR is a great place for us to discuss your performance improvements and ideas, so please keep this open, keep updating it, and I will hopefully have time to review the rest of it soon.

Thanks! This is exciting work!

hyperupcall · 2023-01-22T09:08:37Z

Awesome! Performance increases are a much needed fix.

I know #1433 was mentioned, and as its author, I know that PR is pretty big, but I only fixed formatting (mostly), for only files in test/*, so there should be little to no conflicts. Same with the other PR.

I've identified some Bash incompatibilities, and some possible improvements. There might be some disagreement of some sort about the first suggestion, so maybe that one can be hashed out once the mentioned additional perf test is integrated in the CI pipeline

hyperupcall · 2023-01-22T09:08:58Z

lib/utils.bash

@@ -7,6 +7,45 @@ GREP_COLORS=

 ASDF_DIR=${ASDF_DIR:-''}
 ASDF_DATA_DIR=${ASDF_DATA_DIR:-''}
+BASH_MAJOR_VERSION=${BASH_VERSION%%.*}
+
+fast_basename() {


This can be changed to:?

local path="$1" path=${path%/} path=${path##*/} printf '%s\n' "$REPLY"

The extra tests seem redundant as Bash should perform them internally.

I think it's very worth pursuing having a REPLY-like pattern as shown here to completely eliminate the subshell (I strongly prefer the variable being called REPLY, though). In my experience this gives gigantic perf wins, something similar as what it seems you have alluded to in your other comment. I'm not sure how on-board the core-team is with this, though, which is why I think more discussion is needed.

Agreed that the additional subshells can be removed, ty for this recommendation.

Re: $REPLY, I thought that was specifically for read? Using your code sample as-is resulted in an empty output for me.

Yeah it is for read and also select, I'm borrowing the convention (more info here). It will work if the function looks something like this:

fast_basename() { unset -v REPLY; REPLY= local path="$1" path=${path%/} path=${path##*/} REPLY=$path }

And is called like this:

fast_basename "$executable_path" local cmd="$REPLY"

The code sample I provided was for existing subshell code, not the REPLY-thing.

EDIT: Ah, there are times like this that would likely be sped up with the elimination of the subshell.

I ran the two fast_basename types (REPLY and printf) with hyperfine on my M1 Air. I also tried this on a Debian server, but generally was unable to get hyperfine to not issue warnings about interference, no matter the combinations tried. Regardless, the timings were always incredibly close. In this example, printf had a slightly higher max, but was otherwise identical.

❯ cat fast_base*.sh #!/usr/bin/env bash fast_basename() { unset dirpath local dirpath="$1" dirpath=${dirpath%/} dirpath=${dirpath##*/} printf "%s\n" $dirpath } fast_basename "$1" #!/usr/bin/env bash fast_basename() { unset -v REPLY; REPLY= local path="$1" path=${path%/} path=${path##*/} REPLY=$path } fast_basename "$1" ❯ for f in fast_*.sh; do hyperfine -N --warmup 1000 --runs 1000 "./$f $PWD"; done Benchmark 1: ./fast_basename_printf.sh /Users/sgarland Time (mean ± σ): 2.7 ms ± 0.4 ms [User: 1.3 ms, System: 0.9 ms] Range (min … max): 2.5 ms … 5.2 ms 1000 runs Warning: The first benchmarking run for this command was significantly slower than the rest (3.4 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run. Benchmark 1: ./fast_basename_reply.sh /Users/sgarland Time (mean ± σ): 2.7 ms ± 0.4 ms [User: 1.3 ms, System: 0.9 ms] Range (min … max): 2.5 ms … 5.1 ms 1000 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Re: the best practices linked, I don't understand the reasoning for some of them.

unset REPLY; REPLY=

If you unset a var (or function), it's gone. In C you may need to then initialize the empty var to a known value, but this isn't C.

unset i

If it's function-scoped, why bother unsetting it? This is always going to print a null for the first echo, and 10 for the second:

function foo() { echo $i local i= for _ in {0..9}; do ((i=i+1)) done echo $i }

In any case, much of this is quibbling over minute differences, and I'm not that much of a stickler. I care about improving the core performance, and it's clear that you do as well.

For perf test, you have to include the capture of the variable as a part of the test. So I'm not sure what you're trying to measure when you use printf without a subshell. These were my results:

script.sh

#!/usr/bin/env bash fast_basename_work() { local path="$1" path=${path%/} printf '%s\n' "{path##*/}" } slow_basename() { unset -v REPLY REPLY=$(basename "$1") } fast_basename() { unset -v REPLY REPLY=$(fast_basename_work "$1") } faster_basename() { unset -v REPLY; REPLY= local path="$1" path=${path%/} path=${path##*/} REPLY=$path } main() { "$1" "$PWD" local _result="$REPLY" } main "$@"

> for fn in slow_basename fast_basename faster_basename; do hyperfine -N --warmup 1000 --runs 1000 "./script.sh $fn"; done Benchmark 1: ./script.sh slow_basename Time (mean ± σ): 3.6 ms ± 0.2 ms [User: 2.0 ms, System: 1.4 ms] Range (min … max): 3.4 ms … 8.8 ms 1000 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Benchmark 1: ./script.sh fast_basename Time (mean ± σ): 3.2 ms ± 0.4 ms [User: 1.6 ms, System: 1.4 ms] Range (min … max): 2.9 ms … 15.1 ms 1000 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Benchmark 1: ./script.sh faster_basename Time (mean ± σ): 2.7 ms ± 0.3 ms [User: 1.3 ms, System: 1.2 ms] Range (min … max): 2.5 ms … 8.9 ms 1000 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

For your questions about my self-defined "best practices"

If you unset a var (or function), it's gone. In C you may need to then initialize the empty var to a known value, but this isn't C.

The REPLY= is to be absolutely certain that REPLY is what you think it is. For a single function, this doesn't do anything, but if you have hundreds of functions that set REPLY, this ensures that artifacts of running those other functions don't accidentally interfere with the current function.

The unset -v is necessary because we want to be absolutely sure that the variable is of the correct type - it's fundamentally more correct. See below:

script.sh

# shellcheck shell=bash fn1() { unset -v REPLY REPLY= printf 'fn1: %s\n' "$(declare -p REPLY)" REPLY=(a b) } fn2() { # unset -v REPLY REPLY= printf 'fn2: %s\n' "$(declare -p REPLY)" REPLY=c } main() { fn1 local -a result1="${REPLY[@]}" fn2 local result2="$REPLY" } main "$@"

And I can't just do unset -v REPLY because for some reason, people like to enable errunset, and since I'm usually writing Bash libraries, I need to be compatible for the options that they wish to set.

If it's function-scoped, why bother unsetting it? This is always going to print a null for the first echo, and 10 for the second:

Yeah, it's function scoped, and I still don't want that variable to leak into the callers of functions that are called later in that function like so:

otherfn() { local i= for i in {0..1}; do :; done foo }

So by doing unset -v i, the goal is to have a behavior closer to traditionally scoped variables. Bash variables (with local) are function scoped, but more importantly, they are also dynamically scoped, and I wish to account for that.

But yeah, these are minor differences, and I don't wish to prescribe all of these conventions on the ASDF codebase - just the one that actually significantly improves performance.

So I'm not sure what you're trying to measure when you use printf without a subshell.

In the light of day, I also have no idea what I was doing there. Good catch.

And I can't just do unset -v REPLY because for some reason, people like to enable errunset, and since I'm usually writing Bash libraries, I need to be compatible for the options that they wish to set.

An excellent point I hadn't thought of. People love to set -euo pipefail because someone said it was better.

Yeah, it's function scoped, and I still don't want that variable to leak into the callers of functions that are called later in that function like so

Again, an excellent point I hadn't thought of.

I think I'm just going to defer to your recommendations. I thought I knew shell reasonably well, but clearly I've a lot more to learn. Thank you for taking the time to demonstrate your answers with scripts.

hyperupcall · 2023-01-22T09:09:11Z

lib/utils.bash

+  # if it doesn't, use parameter substitution on the path as-is
+  dirpath="$1"
+  [[ "$dirpath" == */ ]] && dirpath="${dirpath%/}" && dirpath="${dirpath##*/}" || dirpath="${dirpath##*/}"
+  printf "$dirpath"


This, and other invocations to printf should have %s\n as a first parameter to prevent unintentional formatting.

EDIT: I should have said %s\n since that makes debugging/developing these functions easier - and subshells, process substitution, herestrings automatically append newlines anywyas.

hyperupcall · 2023-01-22T09:09:15Z

lib/utils.bash

+        outputStr="${inputStr@U}"
+        ;;
+      "[:upper:]")
+        outputStr="${inputStr@L}"


This breaks since @U and @L were added in bash 5.0 while ,, and ^^ were added in bash 4.0. But the check is still good in principle because we support 3.2 at a minimum.

Ah, good catch. I originally had the ^^ syntax but then found out about @U and used it since it seemed more intuitive. I didn't check the version requirements for it, though.

…5-ism

stephanGarland · 2023-01-25T04:19:01Z

lib/utils.bash

-  # exec_paths contains a trailing newline which is converted to a colon, so no
-  # colon is needed between the subshell and the PATH variable in this string
-  path="$(tr '\n' ':' <<<"$exec_paths")$PATH"
+  path="${exec_paths}:$PATH"


I didn't find the comment to be true in testing, and this works fine as-is. Open to proof to the contrary.

parse_asdf_version_file

stephanGarland · 2023-01-25T04:23:30Z

lib/utils.bash

@@ -892,7 +888,9 @@ remove_path_from_path() {
  # Output is a new string suitable for assignment to PATH
  local PATH=$1
  local path=$2
-  substitute "$PATH" "$path" "" | sed -e "s|::|:|g"
+  PATH=${PATH//"$path"}


There is an argument to be made for using the substitute call everywhere to make it more easily understandable, but at the very least, here the sed pipe is unnecessary and this can simply be modified twice.

lib/utils.bash

Stratus3D · 2023-01-27T18:18:30Z

Before we introduce changes like these I would like to standardize and have in our CI pipelines performance testing and reporting to PRs.

For the record I agree with @jthegedus on this. I'd love to improve performance but I also don't want to be reviewing changes that only improve performance by a small percentage, or changes that only improve things for a small subset of users (not saying that is the case here, I'm just now looking at these changes).

Stratus3D

The new functions appear ok, but whether or not these changes will be accepted depends on how these changes improve the performance of asdf commands. The performance of these particular functions on their own is less important than their affect on the performance of the whole codebase.

jthegedus · 2023-01-31T12:28:30Z

Once I finish my other PRs I will work on the performance automation for this repo.

…dding a leading space

stephanGarland · 2023-02-10T17:26:29Z

I found and fixed a bug I had introduced with parse_asdf_version_file that was causing a ton of failures. There are still test failures (34/352, but tbf some of those are from fish and elvish simply because I don't have them installed yet), but the number is greatly reduced.

asdf-vm/asdf#1441 lol shame

ssbarnea · 2024-01-21T21:17:44Z

Happy 1 year anniversary for this PR! Sic.

stephanGarland · 2024-04-28T16:17:19Z

Happy 1 year anniversary for this PR! Sic.

I got frustrated trying to fix bugs only present in esoteric shells like elvish, and around the same time, shifted to mise (née rtx). Sorry. I might pick it up again once my professional life has calmed down some.

stephanGarland added 2 commits January 15, 2023 17:28

feat: first pass at improving speed, mostly for list and current comm…

417c122

…ands

feat: added a fast_tr command, did some more grouping of commands whe…

009ab6e

…re possible

stephanGarland requested a review from a team as a code owner January 20, 2023 04:58

stephanGarland marked this pull request as draft January 20, 2023 04:58

stephanGarland changed the title ~~Draft: speed improvements~~ perf: speed improvements Jan 20, 2023

stephanGarland commented Jan 20, 2023

View reviewed changes

lib/utils.bash Show resolved Hide resolved

stephanGarland commented Jan 20, 2023

View reviewed changes

lib/utils.bash Show resolved Hide resolved

hyperupcall suggested changes Jan 22, 2023

View reviewed changes

stephanGarland mentioned this pull request Jan 24, 2023

asdf shims adding too much time to use them in a shell prompt #290

Open

stephanGarland added 3 commits January 24, 2023 19:36

fix: using native bash versinfo array to check versions; fixed a bash…

f410204

…5-ism

feat: changed fast_basename to use REPLY

f585713

feat: changed fast_tr to use REPLY; eliminated an unnecessary tr call

0084332

stephanGarland commented Jan 25, 2023

View reviewed changes

feat: minor changes to two functions; added comment for

3d86e29

parse_asdf_version_file

stephanGarland force-pushed the develop branch from 0cf3c3e to 3d86e29 Compare January 25, 2023 04:21

stephanGarland commented Jan 25, 2023

View reviewed changes

lib/utils.bash Outdated Show resolved Hide resolved

Stratus3D reviewed Jan 31, 2023

View reviewed changes

stephanGarland added 2 commits February 10, 2023 12:23

fix: identified and fixed bug with parse_asdf_version_file that was a…

1a6d8b9

…dding a leading space

chore: merge branch 'master' into develop

08beda0

hyperupcall mentioned this pull request Jun 15, 2023

Why no subshells? bash-bastion/bash-algo#1

Closed

Riatre added a commit to Riatre/dotfilez that referenced this pull request Jul 26, 2023

zsh: switch from asdf to rtx

8177003

asdf-vm/asdf#1441 lol shame

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: speed improvements #1441

perf: speed improvements #1441

stephanGarland commented Jan 20, 2023

stephanGarland Jan 20, 2023

jthegedus commented Jan 20, 2023 •

edited

hyperupcall commented Jan 22, 2023 •

edited

hyperupcall Jan 22, 2023 •

edited

stephanGarland Jan 22, 2023

hyperupcall Jan 22, 2023

stephanGarland Jan 25, 2023 •

edited

hyperupcall Jan 25, 2023 •

edited

stephanGarland Jan 25, 2023

hyperupcall Jan 22, 2023 •

edited

hyperupcall Jan 22, 2023 •

edited

stephanGarland Jan 22, 2023

stephanGarland Jan 25, 2023

stephanGarland Jan 25, 2023

Stratus3D commented Jan 27, 2023

Stratus3D left a comment

jthegedus commented Jan 31, 2023

stephanGarland commented Feb 10, 2023

ssbarnea commented Jan 21, 2024

stephanGarland commented Apr 28, 2024

perf: speed improvements #1441

Are you sure you want to change the base?

perf: speed improvements #1441

Conversation

stephanGarland commented Jan 20, 2023

Summary

Fixes

Other Information

stephanGarland Jan 20, 2023

Choose a reason for hiding this comment

jthegedus commented Jan 20, 2023 • edited

hyperupcall commented Jan 22, 2023 • edited

hyperupcall Jan 22, 2023 • edited

Choose a reason for hiding this comment

stephanGarland Jan 22, 2023

Choose a reason for hiding this comment

hyperupcall Jan 22, 2023

Choose a reason for hiding this comment

stephanGarland Jan 25, 2023 • edited

Choose a reason for hiding this comment

hyperupcall Jan 25, 2023 • edited

Choose a reason for hiding this comment

stephanGarland Jan 25, 2023

Choose a reason for hiding this comment

hyperupcall Jan 22, 2023 • edited

Choose a reason for hiding this comment

hyperupcall Jan 22, 2023 • edited

Choose a reason for hiding this comment

stephanGarland Jan 22, 2023

Choose a reason for hiding this comment

stephanGarland Jan 25, 2023

Choose a reason for hiding this comment

stephanGarland Jan 25, 2023

Choose a reason for hiding this comment

Stratus3D commented Jan 27, 2023

Stratus3D left a comment

Choose a reason for hiding this comment

jthegedus commented Jan 31, 2023

stephanGarland commented Feb 10, 2023

ssbarnea commented Jan 21, 2024

stephanGarland commented Apr 28, 2024

jthegedus commented Jan 20, 2023 •

edited

hyperupcall commented Jan 22, 2023 •

edited

hyperupcall Jan 22, 2023 •

edited

stephanGarland Jan 25, 2023 •

edited

hyperupcall Jan 25, 2023 •

edited

hyperupcall Jan 22, 2023 •

edited

hyperupcall Jan 22, 2023 •

edited