perf: speed improvements #1441
Conversation
lib/utils.bash
Outdated
@@ -190,7 +229,8 @@ find_versions() {
version=$(get_version_from_env "$plugin_name")
if [ -n "$version" ]; then
local upcase_name
upcase_name=$(printf "%s\n" "$plugin_name" | tr '[:lower:]-' '[:upper:]_')
tr_args=("$plugin_name" "[:lower:]" "[:upper:]" "-" "_")
upcase_name=$(fast_tr ${tr_args[@]})
Due to the subshell this is at best equally as fast as before. Probably a way to do this without a subshell.
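One subshell-free possibility is to do both transformations with parameter expansion. This is only a sketch (the `fast_upcase` name and the `REPLY` convention are assumptions, not code from this PR), and `${var^^}` requires bash 4.0+:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: replace '-' with '_' and uppercase, with no fork.
# Requires bash 4.0+ for the ${var^^} expansion.
fast_upcase() {
  REPLY=${1//-/_}    # dashes -> underscores
  REPLY=${REPLY^^}   # uppercase in place, no subshell
}

fast_upcase "my-plugin-name"
printf '%s\n' "$REPLY"   # MY_PLUGIN_NAME
```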
@stephanGarland A few thoughts after a quick glance.
Before we introduce changes like these, I would like to standardize performance testing and reporting to PRs in our CI pipelines. That way we can identify the actual performance uplift of a specific change. (https://github.com/sharkdp/hyperfine is my current candidate for this automation; how to integrate it is something I will look at very soon.)
Once we have the above, we can go through and add each of these changes one at a time. That way we should have clear data about the improvement in each case.
Further, #1433 & #1138 are partially addressing #1396 & #1397, which will lead us down the path towards using more native shell features. I am not sure if changes for those will clash with what you have here, but be prepared for some rework.
This draft PR is a great place for us to discuss your performance improvements and ideas, so please keep it open and keep updating it, and I will hopefully have time to review the rest of it soon. Thanks! This is exciting work!
Awesome! Performance increases are a much-needed fix. I know #1433 was mentioned, and as its author, I know that PR is pretty big, but I only fixed formatting (mostly), and only for files in
I've identified some Bash incompatibilities and some possible improvements. There might be some disagreement about the first suggestion, so maybe that one can be hashed out once the aforementioned perf test is integrated into the CI pipeline.
@@ -7,6 +7,45 @@ GREP_COLORS=

ASDF_DIR=${ASDF_DIR:-''}
ASDF_DATA_DIR=${ASDF_DATA_DIR:-''}
BASH_MAJOR_VERSION=${BASH_VERSION%%.*}

fast_basename() {
This can be changed to:
local path="$1"
path=${path%/}
path=${path##*/}
printf '%s\n' "$REPLY"
The extra tests seem redundant as Bash should perform them internally.
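The "redundant tests" point is easy to verify: `${var%/}` strips at most one trailing slash and is a harmless no-op otherwise, so no explicit guard is needed (a minimal check, not code from the PR):

```shell
#!/usr/bin/env bash
# ${var%/} only removes a trailing '/' when one exists, so an explicit
# [[ "$path" == */ ]] guard before it adds nothing.
p1="/usr/local/bin/"
p2="/usr/local/bin"
printf '%s\n' "${p1%/}" "${p2%/}"   # both print /usr/local/bin
```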
I think it's very worth pursuing a REPLY-like pattern, as shown here, to completely eliminate the subshell (I strongly prefer the variable being called REPLY, though). In my experience this gives gigantic perf wins, similar to what you seem to allude to in your other comment. I'm not sure how on-board the core team is with this, though, which is why I think more discussion is needed.
Agreed that the additional subshells can be removed, ty for this recommendation.
Re: $REPLY, I thought that was specifically for read? Using your code sample as-is resulted in an empty output for me.
Yeah, it is for read and also select; I'm borrowing the convention (more info here). It will work if the function looks something like this:
fast_basename() {
unset -v REPLY; REPLY=
local path="$1"
path=${path%/}
path=${path##*/}
REPLY=$path
}
And is called like this:
fast_basename "$executable_path"
local cmd="$REPLY"
The code sample I provided was for the existing subshell code, not the REPLY-thing.
EDIT: Ah, there are times like this that would likely be sped up with the elimination of the subshell.
I ran the two fast_basename types (REPLY and printf) with hyperfine on my M1 Air. I also tried this on a Debian server, but generally was unable to get hyperfine to not issue warnings about interference, no matter the combinations tried. Regardless, the timings were always incredibly close. In this example, printf had a slightly higher max, but was otherwise identical.
❯ cat fast_base*.sh
#!/usr/bin/env bash
fast_basename() {
unset dirpath
local dirpath="$1"
dirpath=${dirpath%/}
dirpath=${dirpath##*/}
printf "%s\n" $dirpath
}
fast_basename "$1"
#!/usr/bin/env bash
fast_basename() {
unset -v REPLY; REPLY=
local path="$1"
path=${path%/}
path=${path##*/}
REPLY=$path
}
fast_basename "$1"
❯ for f in fast_*.sh; do hyperfine -N --warmup 1000 --runs 1000 "./$f $PWD"; done
Benchmark 1: ./fast_basename_printf.sh /Users/sgarland
Time (mean ± σ): 2.7 ms ± 0.4 ms [User: 1.3 ms, System: 0.9 ms]
Range (min … max): 2.5 ms … 5.2 ms 1000 runs
Warning: The first benchmarking run for this command was significantly slower than the rest (3.4 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
Benchmark 1: ./fast_basename_reply.sh /Users/sgarland
Time (mean ± σ): 2.7 ms ± 0.4 ms [User: 1.3 ms, System: 0.9 ms]
Range (min … max): 2.5 ms … 5.1 ms 1000 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Re: the best practices linked, I don't understand the reasoning for some of them.
unset REPLY; REPLY=
If you unset a var (or function), it's gone. In C you may need to then initialize the empty var to a known value, but this isn't C.
unset i
If it's function-scoped, why bother unsetting it? This is always going to print a null for the first echo, and 10 for the second:
function foo() {
echo $i
local i=
for _ in {0..9}; do
((i=i+1))
done
echo $i
}
In any case, much of this is quibbling over minute differences, and I'm not that much of a stickler. I care about improving the core performance, and it's clear that you do as well.
For a perf test, you have to include the capture of the variable as part of the test, so I'm not sure what you're trying to measure when you use printf without a subshell. These were my results:
script.sh
#!/usr/bin/env bash
fast_basename_work() {
local path="$1"
path=${path%/}
printf '%s\n' "${path##*/}"
}
slow_basename() {
unset -v REPLY
REPLY=$(basename "$1")
}
fast_basename() {
unset -v REPLY
REPLY=$(fast_basename_work "$1")
}
faster_basename() {
unset -v REPLY; REPLY=
local path="$1"
path=${path%/}
path=${path##*/}
REPLY=$path
}
main() {
"$1" "$PWD"
local _result="$REPLY"
}
main "$@"
> for fn in slow_basename fast_basename faster_basename; do hyperfine -N --warmup 1000 --runs 1000 "./script.sh $fn"; done
Benchmark 1: ./script.sh slow_basename
Time (mean ± σ): 3.6 ms ± 0.2 ms [User: 2.0 ms, System: 1.4 ms]
Range (min … max): 3.4 ms … 8.8 ms 1000 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark 1: ./script.sh fast_basename
Time (mean ± σ): 3.2 ms ± 0.4 ms [User: 1.6 ms, System: 1.4 ms]
Range (min … max): 2.9 ms … 15.1 ms 1000 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark 1: ./script.sh faster_basename
Time (mean ± σ): 2.7 ms ± 0.3 ms [User: 1.3 ms, System: 1.2 ms]
Range (min … max): 2.5 ms … 8.9 ms 1000 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
For your questions about my self-defined "best practices":
If you unset a var (or function), it's gone. In C you may need to then initialize the empty var to a known value, but this isn't C.
The REPLY= is to be absolutely certain that REPLY is what you think it is. For a single function this doesn't do anything, but if you have hundreds of functions that set REPLY, it ensures that artifacts of running those other functions don't accidentally interfere with the current function.
The unset -v is necessary because we want to be absolutely sure that the variable is of the correct type - it's fundamentally more correct. See below:
script.sh
# shellcheck shell=bash
fn1() {
unset -v REPLY
REPLY=
printf 'fn1: %s\n' "$(declare -p REPLY)"
REPLY=(a b)
}
fn2() {
# unset -v REPLY
REPLY=
printf 'fn2: %s\n' "$(declare -p REPLY)"
REPLY=c
}
main() {
fn1
local -a result1=("${REPLY[@]}")
fn2
local result2="$REPLY"
}
main "$@"
And I can't just do unset -v REPLY because, for some reason, people like to enable errunset, and since I'm usually writing Bash libraries, I need to be compatible with the options that they wish to set.
If it's function-scoped, why bother unsetting it? This is always going to print a null for the first echo, and 10 for the second:
Yeah, it's function-scoped, but I still don't want that variable to leak into functions that are called later from within that function, like so:
otherfn() {
local i=
for i in {0..1}; do :; done
foo
}
So by doing unset -v i, the goal is to have behavior closer to traditionally scoped variables. Bash variables (with local) are function-scoped, but more importantly, they are also dynamically scoped, and I wish to account for that.
But yeah, these are minor differences, and I don't wish to impose all of these conventions on the ASDF codebase - just the one that actually significantly improves performance.
So I'm not sure what you're trying to measure when you use printf without a subshell.
In the light of day, I also have no idea what I was doing there. Good catch.
And I can't just do unset -v REPLY because for some reason, people like to enable errunset, and since I'm usually writing Bash libraries, I need to be compatible for the options that they wish to set.
An excellent point I hadn't thought of. People love to set -euo pipefail because someone said it was better.
Yeah, it's function scoped, and I still don't want that variable to leak into the callers of functions that are called later in that function like so
Again, an excellent point I hadn't thought of.
I think I'm just going to defer to your recommendations. I thought I knew shell reasonably well, but clearly I've a lot more to learn. Thank you for taking the time to demonstrate your answers with scripts.
lib/utils.bash
Outdated
# if it doesn't, use parameter substitution on the path as-is
dirpath="$1"
[[ "$dirpath" == */ ]] && dirpath="${dirpath%/}" && dirpath="${dirpath##*/}" || dirpath="${dirpath##*/}"
printf "$dirpath"
This, and other invocations of printf, should have %s\n as the first parameter to prevent unintentional formatting.
EDIT: I should have said %s\n, since that makes debugging/developing these functions easier - and subshells, process substitution, and herestrings automatically append newlines anyway.
lib/utils.bash
Outdated
outputStr="${inputStr@U}"
;;
"[:upper:]")
outputStr="${inputStr@L}"
This breaks, since @U and @L were added in bash 5.0, while ,, and ^^ were added in bash 4.0. But the check is still good in principle, because we support 3.2 at a minimum.
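One way to gate this could look like the following sketch (the `to_upper` name and `REPLY` convention are assumptions; note bash reports `${var^^}` as a bad substitution only when the expansion is actually evaluated, so the untaken branch should be safe on 3.2, but that would need verification on a real 3.2 shell):

```shell
#!/usr/bin/env bash
# Hedged sketch: use the fork-free ${var^^} on bash 4.0+, and fall back
# to tr on older shells such as the 3.2 minimum mentioned above.
BASH_MAJOR_VERSION=${BASH_VERSION%%.*}

to_upper() {
  if ((BASH_MAJOR_VERSION >= 4)); then
    REPLY=${1^^}                                            # no fork
  else
    REPLY=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')  # 3.2 fallback
  fi
}

to_upper "hello"
printf '%s\n' "$REPLY"   # HELLO
```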
Ah, good catch. I originally had the ^^ syntax, but then found out about @U and used it since it seemed more intuitive. I didn't check the version requirements for it, though.
# exec_paths contains a trailing newline which is converted to a colon, so no
# colon is needed between the subshell and the PATH variable in this string
path="$(tr '\n' ':' <<<"$exec_paths")$PATH"
path="${exec_paths}:$PATH"
I didn't find the comment to be true in testing, and this works fine as-is. Open to proof to the contrary.
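For reference, the interaction between the herestring, tr, and command substitution can be probed directly (made-up values, not taken from the codebase); the herestring appends its own newline, and `$(...)` strips only trailing newlines, not the colons tr produces:

```shell
#!/usr/bin/env bash
# exec_paths ends in a newline; <<< appends one more; tr turns every
# newline into ':'. Command substitution strips trailing newlines only,
# so the colons survive.
exec_paths=$'/a/bin\n/b/bin\n'
path="$(tr '\n' ':' <<<"$exec_paths")"
printf '%s\n' "$path"   # /a/bin:/b/bin::
```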
Force-pushed parse_asdf_version_file from 0cf3c3e to 3d86e29
lib/utils.bash
Outdated
@@ -892,7 +888,9 @@ remove_path_from_path() {
# Output is a new string suitable for assignment to PATH
local PATH=$1
local path=$2
substitute "$PATH" "$path" "" | sed -e "s|::|:|g"
PATH=${PATH//"$path"}
There is an argument to be made for using the substitute call everywhere to make the code more easily understandable, but at the very least, the sed pipe here is unnecessary: the variable can simply be modified twice with parameter expansion.
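A sketch of what the fork-free version could look like (the `remove_path_entry` name avoids shadowing the real PATH purely for illustration, and edge cases such as leading or trailing entries are deliberately not handled here):

```shell
#!/usr/bin/env bash
# Drop an entry from a PATH-style string, then collapse the '::' left
# behind - two parameter expansions, no sed fork.
remove_path_entry() {
  local joined=$1
  local entry=$2
  joined=${joined//"$entry"/}   # remove every occurrence of the entry
  joined=${joined//::/:}        # collapse doubled colons
  printf '%s\n' "$joined"
}

remove_path_entry "/usr/bin:/opt/asdf/bin:/bin" "/opt/asdf/bin"
# -> /usr/bin:/bin
```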
For the record, I agree with @jthegedus on this. I'd love to improve performance, but I also don't want to be reviewing changes that only improve performance by a small percentage, or changes that only improve things for a small subset of users (not saying that is the case here; I'm just now looking at these changes).
The new functions appear OK, but whether or not these changes will be accepted depends on how they improve the performance of asdf commands. The performance of these particular functions on their own is less important than their effect on the performance of the whole codebase.
Once I finish my other PRs I will work on the performance automation for this repo.
I found and fixed a bug I had introduced with
Happy 1 year anniversary for this PR!
I got frustrated trying to fix bugs only present in esoteric shells like elvish, and around the same time, I shifted to mise (née rtx). Sorry. I might pick this up again once my professional life has calmed down some.
Summary
This is a draft for now. It pulls in the patch I linked here, as well as some other work. I mostly wanted to get people's opinions on it early, before I continue with the entire project.
In general, it speeds up performance by reducing fork calls, either by combining various combinations of sed | grep | cut etc. into a single awk call or, where possible, by using shell built-ins like parameter substitution and modification.
Fixes
It addresses but does not fix issue 290
Other Information
I haven't run the full test suite yet, but I know at least one test (where) fails; I believe this to be a false positive, and will address it.
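As an illustration of the pipeline consolidation described in the summary (made-up data, not code from this PR), a grep-plus-cut pair collapsed into a single awk call saves one fork per dropped process:

```shell
#!/usr/bin/env bash
# Before: printf ... | grep '^nodejs' | cut -d' ' -f2   (three forks)
# After:  one awk call does both the filtering and the field extraction.
versions=$'ruby 3.2.1\nnodejs 18.16.0\npython 3.11.4'
printf '%s\n' "$versions" | awk '$1 == "nodejs" { print $2 }'   # 18.16.0
```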