-
Notifications
You must be signed in to change notification settings - Fork 760
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: speed improvements #1441
base: master
Are you sure you want to change the base?
perf: speed improvements #1441
Changes from 2 commits
417c122
009ab6e
f410204
f585713
0084332
3d86e29
1a6d8b9
08beda0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,6 +7,45 @@ GREP_COLORS= | |
|
||
ASDF_DIR=${ASDF_DIR:-''} | ||
ASDF_DATA_DIR=${ASDF_DATA_DIR:-''} | ||
BASH_MAJOR_VERSION=${BASH_VERSION%%.*} | ||
|
||
fast_basename() { | ||
# if the path ends in a slash, remove it, then use parameter substitution | ||
# if it doesn't, use parameter substitution on the path as-is | ||
dirpath="$1" | ||
[[ "$dirpath" == */ ]] && dirpath="${dirpath%/}" && dirpath="${dirpath##*/}" || dirpath="${dirpath##*/}" | ||
printf "$dirpath" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This, and other invocations to EDIT: I should have said |
||
} | ||
|
||
fast_tr() { | ||
inputArr=("$@") | ||
inputStr="${inputArr[0]}" | ||
local outputStr | ||
if [ ! ${#inputArr[@]} -eq 3 ] && [ ! ${#inputArr[@]} -eq 5 ]; then | ||
printf "%s\n" "ERROR: fast_tr expects an array of 3 or 5 elements, got ${#inputArr[@]}" | ||
return 1 | ||
fi | ||
# if bash >= v4, use parameter modification to upper/lower the string | ||
if [ ${BASH_MAJOR_VERSION} -gt 3 ]; then | ||
case ${inputArr[1]} in | ||
"[:lower:]") | ||
outputStr="${inputStr@U}" | ||
;; | ||
"[:upper:]") | ||
outputStr="${inputStr@L}" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This breaks since There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah, good catch. I originally had the |
||
;; | ||
"*") | ||
printf "ERROR: fast_tr() expects [:lower:] and [:upper:] as arguments" | ||
esac | ||
# and then use parameter substitution to replace characters in the string | ||
# if the 4th or 5th elements are not set, the substitution has no effect | ||
stephanGarland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
outputStr="${outputStr//${inputArr[3]-''}/${inputArr[4]-''}}" | ||
printf "${outputStr}" | ||
else | ||
# else use tr with the input array elements as args | ||
printf "${inputStr}" | tr "${inputArr[1]}${inputArr[3]-''}" "${inputArr[2]}${inputArr[4]-''}" | ||
fi | ||
} | ||
|
||
asdf_version() { | ||
local version git_rev | ||
|
@@ -99,7 +138,7 @@ list_installed_versions() { | |
if [ -d "$plugin_installs_path" ]; then | ||
for install in "${plugin_installs_path}"/*/; do | ||
[[ -e "$install" ]] || break | ||
basename "$install" | sed 's/^ref-/ref:/' | ||
fast_basename "$install" | ||
done | ||
fi | ||
} | ||
|
@@ -190,7 +229,8 @@ find_versions() { | |
version=$(get_version_from_env "$plugin_name") | ||
if [ -n "$version" ]; then | ||
local upcase_name | ||
upcase_name=$(printf "%s\n" "$plugin_name" | tr '[:lower:]-' '[:upper:]_') | ||
tr_args=("$plugin_name" "[:lower:]" "[:upper:]" "-" "_") | ||
upcase_name=$(fast_tr ${tr_args[@]}) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Due to the subshell this is at best equally as fast as before. Probably a way to do this without a subshell. |
||
local version_env_var="ASDF_${upcase_name}_VERSION" | ||
|
||
printf "%s\n" "$version|$version_env_var environment variable" | ||
|
@@ -280,7 +320,7 @@ get_custom_executable_path() { | |
|
||
# custom plugin hook for executable path | ||
if [ -x "${plugin_path}/bin/exec-path" ]; then | ||
cmd=$(basename "$executable_path") | ||
cmd=$(fast_basename "$executable_path") | ||
local relative_path | ||
# shellcheck disable=SC2001 | ||
relative_path=$(printf "%s\n" "$executable_path" | sed -e "s|${install_path}/||") | ||
|
@@ -300,7 +340,7 @@ get_executable_path() { | |
|
||
if [ "$version" = "system" ]; then | ||
path=$(remove_path_from_path "$PATH" "$(asdf_data_dir)/shims") | ||
cmd=$(basename "$executable_path") | ||
cmd=$(fast_basename "$executable_path") | ||
cmd_path=$(PATH=$path command -v "$cmd" 2>&1) | ||
# shellcheck disable=SC2181 | ||
if [ $? -ne 0 ]; then | ||
|
@@ -320,8 +360,8 @@ parse_asdf_version_file() { | |
|
||
if [ -f "$file_path" ]; then | ||
local version | ||
version=$(strip_tool_version_comments "$file_path" | grep "^${plugin_name} " | sed -e "s/^${plugin_name} //") | ||
|
||
# this is actually only about 5% faster than the original, despite losing a function call and two pipes | ||
version=$(awk -v plugin_name="$plugin_name" '!/^#/ { gsub(/#.*/,"",$0); gsub(/ +$/,"",$0)} $1 == plugin_name {print $NF}' "$file_path") | ||
if [ -n "$version" ]; then | ||
if [[ "$version" == path:* ]]; then | ||
util_resolve_user_path "${version#path:}" | ||
|
@@ -634,11 +674,15 @@ plugin_shims() { | |
|
||
shim_plugin_versions() { | ||
local executable_name | ||
executable_name=$(basename "$1") | ||
# not sure why this needs a basename | ||
executable_name=$(fast_basename "$1") | ||
local shim_path | ||
shim_path="$(asdf_data_dir)/shims/${executable_name}" | ||
if [ -x "$shim_path" ]; then | ||
grep "# asdf-plugin: " "$shim_path" 2>/dev/null | sed -e "s/# asdf-plugin: //" | uniq | ||
# not sure why uniq was needed here since asdf doesn't let you install the same plugin version twice - maybe for legacy? | ||
# nevertheless, the awk command handles de-duping; if it isn't needed, the sed command is ~3% faster | ||
#sed -n "/# asdf-plugin: /s/# asdf-plugin: //p 2>/dev/null" "$shim_path" | ||
awk '/# asdf-plugin: / { l=$(NF-1) " " $NF; !seen[$0]++; if (seen[$0] < 2) { print l } }' "$shim_path" 2>/dev/null | ||
else | ||
printf "asdf: unknown shim %s\n" "$executable_name" | ||
return 1 | ||
|
@@ -647,11 +691,11 @@ shim_plugin_versions() { | |
|
||
shim_plugins() { | ||
local executable_name | ||
executable_name=$(basename "$1") | ||
executable_name=$(fast_basename "$1") | ||
local shim_path | ||
shim_path="$(asdf_data_dir)/shims/${executable_name}" | ||
if [ -x "$shim_path" ]; then | ||
grep "# asdf-plugin: " "$shim_path" 2>/dev/null | sed -e "s/# asdf-plugin: //" | cut -d' ' -f 1 | uniq | ||
awk '/# asdf-plugin: / { l=$(NF-1) } END { print l} ' "$shim_path" 2>/dev/null | ||
stephanGarland marked this conversation as resolved.
Show resolved
Hide resolved
|
||
else | ||
printf "asdf: unknown shim %s\n" "$executable_name" | ||
return 1 | ||
|
@@ -667,7 +711,9 @@ strip_tool_version_comments() { | |
# 2. Find a # and delete it and everything after the #. | ||
# 3. Remove any whitespace from the end of the line. | ||
# Finally, the command will print the lines that are not empty. | ||
sed '/^[[:blank:]]*#/d;s/#.*//;s/[[:blank:]]*$//' "$tool_version_path" | ||
# this is ~2% faster than the original implementation | ||
# by combining steps 2 and 3 | ||
sed '/^[[:blank:]]*#/d; s/\s*#.*$//;' "$tool_version_path" | ||
} | ||
|
||
asdf_run_hook() { | ||
|
@@ -686,19 +732,18 @@ asdf_run_hook() { | |
get_shim_versions() { | ||
shim_name=$1 | ||
shim_plugin_versions "${shim_name}" | ||
shim_plugin_versions "${shim_name}" | cut -d' ' -f 1 | awk '{print$1" system"}' | ||
shim_plugin_versions "${shim_name%% [0-9]*} system}" | ||
} | ||
|
||
preset_versions() { | ||
shim_name=$1 | ||
shim_plugin_versions "${shim_name}" | cut -d' ' -f 1 | uniq | xargs -IPLUGIN bash -c ". $(asdf_dir)/lib/utils.bash; printf \"%s %s\n\" PLUGIN \$(get_preset_version_for PLUGIN)" | ||
shim_plugin_versions "${shim_name%% [0-9]*}" | xargs -IPLUGIN bash -c ". $(asdf_dir)/lib/utils.bash; printf \"%s %s\n\" PLUGIN \$(get_preset_version_for PLUGIN)" | ||
} | ||
|
||
select_from_preset_version() { | ||
local shim_name=$1 | ||
local shim_versions | ||
local preset_versions | ||
|
||
shim_versions=$(get_shim_versions "$shim_name") | ||
if [ -n "$shim_versions" ]; then | ||
preset_versions=$(preset_versions "$shim_name") | ||
|
@@ -721,7 +766,6 @@ select_version() { | |
|
||
local plugins | ||
IFS=$'\n' read -rd '' -a plugins <<<"$(shim_plugins "$shim_name")" | ||
|
||
for plugin_name in "${plugins[@]}"; do | ||
local version_and_path | ||
local version_string | ||
|
@@ -751,7 +795,7 @@ select_version() { | |
|
||
with_shim_executable() { | ||
local shim_name | ||
shim_name=$(basename "$1") | ||
shim_name=$(fast_basename "$1") | ||
local shim_exec="${2}" | ||
|
||
if [ ! -f "$(asdf_data_dir)/shims/${shim_name}" ]; then | ||
|
@@ -761,7 +805,6 @@ with_shim_executable() { | |
|
||
local selected_version | ||
selected_version="$(select_version "$shim_name")" | ||
|
||
if [ -z "$selected_version" ]; then | ||
selected_version="$(select_from_preset_version "$shim_name")" | ||
fi | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be changed to:?
The extra tests seem redundant as Bash should perform them internally.
I think it's very worth pursuing having a
REPLY
-like pattern as shown here to completely eliminate the subshell (I strongly prefer the variable being calledREPLY
, though). In my experience this gives gigantic perf wins, something similar as what it seems you have alluded to in your other comment. I'm not sure how on-board the core-team is with this, though, which is why I think more discussion is needed.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that the additional subshells can be removed, ty for this recommendation.
Re:
$REPLY
, I thought that was specifically for read? Using your code sample as-is resulted in an empty output for me.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it is for
read
and alsoselect
, I'm borrowing the convention (more info here). It will work if the function looks something like this:And is called like this:
The code sample I provided was for existing subshell code, not the
REPLY
-thing.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EDIT: Ah, there are times like this that would likely be sped up with the elimination of the subshell.
I ran the two fast_basename types (REPLY and printf) with hyperfine on my M1 Air. I also tried this on a Debian server, but generally was unable to get hyperfine to not issue warnings about interference, no matter the combinations tried. Regardless, the timings were always incredibly close. In this example, printf had a slightly higher max, but was otherwise identical.
Re: the best practices linked, I don't understand the reasoning for some of them.
If you unset a var (or function), it's gone. In C you may need to then initialize the empty var to a known value, but this isn't C.
If it's function-scoped, why bother unsetting it? This is always going to print a null for the first echo, and 10 for the second:
In any case, much of this is quibbling over minute differences, and I'm not that much of a stickler. I care about improving the core performance, and it's clear that you do as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For perf test, you have to include the capture of the variable as a part of the test. So I'm not sure what you're trying to measure when you use
printf
without a subshell. These were my results:script.sh
For your questions about my self-defined "best practices"
The
REPLY=
is to be absolutely certain thatREPLY
is what you think it is. For a single function, this doesn't do anything, but if you have hundreds of functions that setREPLY
, this ensures that artifacts of running those other functions don't accidentally interfere with the current function.The
unset -v
is necessary because we want to be absolutely sure that the variable is of the correct type - it's fundamentally more correct. See below:script.sh
And I can't just do
unset -v REPLY
because for some reason, people like to enableerrunset
, and since I'm usually writing Bash libraries, I need to be compatible for the options that they wish to set.Yeah, it's function scoped, and I still don't want that variable to leak into the callers of functions that are called later in that function like so:
So by doing
unset -v i
, the goal is to have a behavior closer to traditionally scoped variables. Bash variables (withlocal
) are function scoped, but more importantly, they are also dynamically scoped, and I wish to account for that.But yeah, these are minor differences, and I don't wish to prescribe all of these conventions on the ASDF codebase - just the one that actually significantly improves performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the light of day, I also have no idea what I was doing there. Good catch.
An excellent point I hadn't thought of. People love to set
-euo pipefail
because someone said it was better.Again, an excellent point I hadn't thought of.
I think I'm just going to defer to your recommendations. I thought I knew shell reasonably well, but clearly I've a lot more to learn. Thank you for taking the time to demonstrate your answers with scripts.