Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: speed improvements #1441

Draft
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

stephanGarland
Copy link

Summary

This is a draft for now. It pulls in the patch I linked here, as well as some other work. I mostly wanted to get people's opinion on it early before I continue with the entire project.

In general, it speeds up performance by reducing fork calls, either by combining various combinations of sed | grep | cut etc. into a single awk call, or where possible, using shell built-ins like parameter substitution and modification.

Fixes

It addresses but does not fix issue 290

Other Information

I haven't ran the full test suite yet, but I know at least one (where) fails; I believe this to be a false positive, and will address it.

@stephanGarland stephanGarland requested a review from a team as a code owner January 20, 2023 04:58
@stephanGarland stephanGarland marked this pull request as draft January 20, 2023 04:58
@stephanGarland stephanGarland changed the title Draft: speed improvements perf: speed improvements Jan 20, 2023
lib/utils.bash Outdated
@@ -190,7 +229,8 @@ find_versions() {
version=$(get_version_from_env "$plugin_name")
if [ -n "$version" ]; then
local upcase_name
upcase_name=$(printf "%s\n" "$plugin_name" | tr '[:lower:]-' '[:upper:]_')
tr_args=("$plugin_name" "[:lower:]" "[:upper:]" "-" "_")
upcase_name=$(fast_tr ${tr_args[@]})
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the subshell this is at best equally as fast as before. Probably a way to do this without a subshell.

@jthegedus
Copy link
Contributor

jthegedus commented Jan 20, 2023

@stephanGarland A few thoughts after a quick glance.

Before we introduce changes like these I would like to standardize and have in our CI pipelines performance testing and reporting to PRs. That way we identify the actual performance uplift of a specific change. (https://github.com/sharkdp/hyperfine is my current candidate for this automation. How to integrate it is something I will look at very soon).

Once we have the above, we can then go through and add each of these changes one at a time. That way we should have clear data about the improvement for each case.

Further, #1433 & #1138 are partially addressing #1396 & #1397 which will lead us down the path towards using more native shell features. I am not sure if changes for those will clash with what you have here, but be prepared for some rework here.

This draft PR is a great place for us to discuss your performance improvements and ideas, so please keep this open, keep updating it, and I will hopefully have time to review the rest of it soon.

Thanks! This is exciting work!

@hyperupcall
Copy link
Contributor

hyperupcall commented Jan 22, 2023

Awesome! Performance increases are a much needed fix.

I know #1433 was mentioned, and as its author, I know that PR is pretty big, but I only fixed formatting (mostly), for only files in test/*, so there should be little to no conflicts. Same with the other PR.

I've identified some Bash incompatibilities, and some possible improvements. There might be some disagreement of some sort about the first suggestion, so maybe that one can be hashed out once the mentioned additional perf test is integrated in the CI pipeline

@@ -7,6 +7,45 @@ GREP_COLORS=

ASDF_DIR=${ASDF_DIR:-''}
ASDF_DATA_DIR=${ASDF_DATA_DIR:-''}
BASH_MAJOR_VERSION=${BASH_VERSION%%.*}

fast_basename() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be changed to:?

local path="$1"
path=${path%/}
path=${path##*/}
printf '%s\n' "$REPLY"

The extra tests seem redundant as Bash should perform them internally.

I think it's very worth pursuing having a REPLY-like pattern as shown here to completely eliminate the subshell (I strongly prefer the variable being called REPLY, though). In my experience this gives gigantic perf wins, something similar as what it seems you have alluded to in your other comment. I'm not sure how on-board the core-team is with this, though, which is why I think more discussion is needed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that the additional subshells can be removed, ty for this recommendation.

Re: $REPLY, I thought that was specifically for read? Using your code sample as-is resulted in an empty output for me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it is for read and also select, I'm borrowing the convention (more info here). It will work if the function looks something like this:

fast_basename() {
  unset -v REPLY; REPLY=

  local path="$1"
  path=${path%/}
  path=${path##*/}
  REPLY=$path
}

And is called like this:

fast_basename "$executable_path"
local cmd="$REPLY"

The code sample I provided was for existing subshell code, not the REPLY-thing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EDIT: Ah, there are times like this that would likely be sped up with the elimination of the subshell.

I ran the two fast_basename types (REPLY and printf) with hyperfine on my M1 Air. I also tried this on a Debian server, but generally was unable to get hyperfine to not issue warnings about interference, no matter the combinations tried. Regardless, the timings were always incredibly close. In this example, printf had a slightly higher max, but was otherwise identical.

❯ cat fast_base*.sh
#!/usr/bin/env bash

fast_basename() {
  unset dirpath
  local dirpath="$1"
  dirpath=${dirpath%/}
  dirpath=${dirpath##*/}
  printf "%s\n" $dirpath
}

fast_basename "$1"
#!/usr/bin/env bash

fast_basename() {
  unset -v REPLY; REPLY=
  local path="$1"
  path=${path%/}
  path=${path##*/}
  REPLY=$path
}

fast_basename "$1"

❯ for f in fast_*.sh; do hyperfine -N --warmup 1000 --runs 1000 "./$f $PWD"; done
Benchmark 1: ./fast_basename_printf.sh /Users/sgarland
  Time (mean ± σ):       2.7 ms ±   0.4 ms    [User: 1.3 ms, System: 0.9 ms]
  Range (min … max):     2.5 ms …   5.2 ms    1000 runs

  Warning: The first benchmarking run for this command was significantly slower than the rest (3.4 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.

Benchmark 1: ./fast_basename_reply.sh /Users/sgarland
  Time (mean ± σ):       2.7 ms ±   0.4 ms    [User: 1.3 ms, System: 0.9 ms]
  Range (min … max):     2.5 ms …   5.1 ms    1000 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Re: the best practices linked, I don't understand the reasoning for some of them.

unset REPLY; REPLY=

If you unset a var (or function), it's gone. In C you may need to then initialize the empty var to a known value, but this isn't C.

unset i

If it's function-scoped, why bother unsetting it? This is always going to print a null for the first echo, and 10 for the second:

function foo() {
  echo $i
  local i=
  for _ in {0..9}; do 
    ((i=i+1))
  done
  echo $i
}

In any case, much of this is quibbling over minute differences, and I'm not that much of a stickler. I care about improving the core performance, and it's clear that you do as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For perf test, you have to include the capture of the variable as a part of the test. So I'm not sure what you're trying to measure when you use printf without a subshell. These were my results:

script.sh
#!/usr/bin/env bash

fast_basename_work() {
	local path="$1"
	path=${path%/}
	printf '%s\n' "{path##*/}"
}

slow_basename() {
	unset -v REPLY
	REPLY=$(basename "$1")
}

fast_basename() {
	unset -v REPLY
	REPLY=$(fast_basename_work "$1")
}

faster_basename() {
	unset -v REPLY; REPLY=
	local path="$1"
	path=${path%/}
	path=${path##*/}
	REPLY=$path
}

main() {
	"$1" "$PWD"
	local _result="$REPLY"
}

main "$@"
> for fn in slow_basename fast_basename faster_basename; do hyperfine -N --warmup 1000 --runs 1000 "./script.sh $fn"; done
Benchmark 1: ./script.sh slow_basename
  Time (mean ± σ):       3.6 ms ±   0.2 ms    [User: 2.0 ms, System: 1.4 ms]
  Range (min … max):     3.4 ms …   8.8 ms    1000 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 1: ./script.sh fast_basename
  Time (mean ± σ):       3.2 ms ±   0.4 ms    [User: 1.6 ms, System: 1.4 ms]
  Range (min … max):     2.9 ms …  15.1 ms    1000 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 1: ./script.sh faster_basename
  Time (mean ± σ):       2.7 ms ±   0.3 ms    [User: 1.3 ms, System: 1.2 ms]
  Range (min … max):     2.5 ms …   8.9 ms    1000 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

For your questions about my self-defined "best practices"

If you unset a var (or function), it's gone. In C you may need to then initialize the empty var to a known value, but this isn't C.

The REPLY= is to be absolutely certain that REPLY is what you think it is. For a single function, this doesn't do anything, but if you have hundreds of functions that set REPLY, this ensures that artifacts of running those other functions don't accidentally interfere with the current function.

The unset -v is necessary because we want to be absolutely sure that the variable is of the correct type - it's fundamentally more correct. See below:

script.sh
# shellcheck shell=bash

fn1() {
	unset -v REPLY
	REPLY=

	printf 'fn1: %s\n' "$(declare -p REPLY)"
	REPLY=(a b)
}

fn2() {
	# unset -v REPLY
	REPLY=

	printf 'fn2: %s\n' "$(declare -p REPLY)"
	REPLY=c
}

main() {
	fn1
	local -a result1="${REPLY[@]}"

	fn2
	local result2="$REPLY"
}

main "$@"

And I can't just do unset -v REPLY because for some reason, people like to enable errunset, and since I'm usually writing Bash libraries, I need to be compatible for the options that they wish to set.

If it's function-scoped, why bother unsetting it? This is always going to print a null for the first echo, and 10 for the second:

Yeah, it's function scoped, and I still don't want that variable to leak into the callers of functions that are called later in that function like so:

otherfn() {
  local i=
  for i in {0..1}; do :; done

  foo
}

So by doing unset -v i, the goal is to have a behavior closer to traditionally scoped variables. Bash variables (with local) are function scoped, but more importantly, they are also dynamically scoped, and I wish to account for that.

But yeah, these are minor differences, and I don't wish to prescribe all of these conventions on the ASDF codebase - just the one that actually significantly improves performance.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm not sure what you're trying to measure when you use printf without a subshell.

In the light of day, I also have no idea what I was doing there. Good catch.

And I can't just do unset -v REPLY because for some reason, people like to enable errunset, and since I'm usually writing Bash libraries, I need to be compatible for the options that they wish to set.

An excellent point I hadn't thought of. People love to set -euo pipefail because someone said it was better.

Yeah, it's function scoped, and I still don't want that variable to leak into the callers of functions that are called later in that function like so

Again, an excellent point I hadn't thought of.

I think I'm just going to defer to your recommendations. I thought I knew shell reasonably well, but clearly I've a lot more to learn. Thank you for taking the time to demonstrate your answers with scripts.

lib/utils.bash Outdated
# if it doesn't, use parameter substitution on the path as-is
dirpath="$1"
[[ "$dirpath" == */ ]] && dirpath="${dirpath%/}" && dirpath="${dirpath##*/}" || dirpath="${dirpath##*/}"
printf "$dirpath"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, and other invocations to printf should have %s\n as a first parameter to prevent unintentional formatting.

EDIT: I should have said %s\n since that makes debugging/developing these functions easier - and subshells, process substitution, herestrings automatically append newlines anywyas.

lib/utils.bash Outdated
outputStr="${inputStr@U}"
;;
"[:upper:]")
outputStr="${inputStr@L}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks since @U and @L were added in bash 5.0 while ,, and ^^ were added in bash 4.0. But the check is still good in principle because we support 3.2 at a minimum.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch. I originally had the ^^ syntax but then found out about @U and used it since it seemed more intuitive. I didn't check the version requirements for it, though.

# exec_paths contains a trailing newline which is converted to a colon, so no
# colon is needed between the subshell and the PATH variable in this string
path="$(tr '\n' ':' <<<"$exec_paths")$PATH"
path="${exec_paths}:$PATH"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find the comment to be true in testing, and this works fine as-is. Open to proof to the contrary.

lib/utils.bash Outdated
@@ -892,7 +888,9 @@ remove_path_from_path() {
# Output is a new string suitable for assignment to PATH
local PATH=$1
local path=$2
substitute "$PATH" "$path" "" | sed -e "s|::|:|g"
PATH=${PATH//"$path"}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an argument to be made for using the substitute call everywhere to make it more easily understandable, but at the very least, here the sed pipe is unnecessary and this can simply be modified twice.

lib/utils.bash Outdated Show resolved Hide resolved
@Stratus3D
Copy link
Member

Before we introduce changes like these I would like to standardize and have in our CI pipelines performance testing and reporting to PRs.

For the record I agree with @jthegedus on this. I'd love to improve performance but I also don't want to be reviewing changes that only improve performance by a small percentage, or changes that only improve things for a small subset of users (not saying that is the case here, I'm just now looking at these changes).

Copy link
Member

@Stratus3D Stratus3D left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new functions appear ok, but whether or not these changes will be accepted depends on how these changes improve the performance of asdf commands. The performance of these particular functions on their own is less important than their affect on the performance of the whole codebase.

@jthegedus
Copy link
Contributor

Once I finish my other PRs I will work on the performance automation for this repo.

@stephanGarland
Copy link
Author

I found and fixed a bug I had introduced with parse_asdf_version_file that was causing a ton of failures. There are still test failures (34/352, but tbf some of those are from fish and elvish simply because I don't have them installed yet), but the number is greatly reduced.

Riatre added a commit to Riatre/dotfilez that referenced this pull request Jul 26, 2023
@ssbarnea
Copy link

Happy 1 year anniversary for this PR! Sic.

@stephanGarland
Copy link
Author

Happy 1 year anniversary for this PR! Sic.

I got frustrated trying to fix bugs only present in esoteric shells like elvish, and around the same time, shifted to mise (née rtx). Sorry. I might pick it up again once my professional life has calmed down some.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants