
Reduce runtime of Go Encode() implementation by 20% #566

Open · wants to merge 1 commit into base: main
Conversation

willbeason

This is a performance optimization change for Encode().

As-is, Encode() always calls a stringconcat (by "+"-ing strings together). This call is unnecessary, since we know the structure of the location codes ahead of time, and it hurts performance, since stringconcat is slow compared to slicebytetostring.

To do this, I made the initially allocated byte slice represent the entire final code and immediately filled in the separator. Later, when we format the code, we no longer need to concatenate the Separator in, as it is already in place.

I introduced a special case for the location pair after the separator rather than computing those bytes in the loop; otherwise we would need an if-statement inside the loop. As a bonus, computing these first lets us remove the two unnecessary divides that formerly happened at the end of the final loop. (I am unsure whether this last change is noticeable; I could not detect a significant difference.)

I haven't bothered optimizing the case for codes that need padding before the separator (or that do not use the full 15 non-separator characters); since the benchmark does not exercise such cases, I assumed they are far less common.
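The strategy described above can be sketched roughly as follows. This is a minimal illustration, not the actual PR diff: `encodePairs`, its precomputed digit-slice input, and the fixed 10-digit code length are assumptions made for the example, though the separator position and alphabet match Open Location Code.

```go
package main

import "fmt"

const (
	separator    = '+'
	separatorPos = 8
	alphabet     = "23456789CFGHJMPQRVWX" // OLC base-20 digit set
)

// encodePairs allocates one byte slice sized for the whole code,
// writes the separator up front, and fills digit positions directly.
// The final string comes from a single []byte-to-string conversion
// (slicebytetostring); no intermediate "+" concatenations occur.
func encodePairs(digits []byte, codeLen int) string {
	buf := make([]byte, codeLen+1) // +1 for the pre-filled separator
	buf[separatorPos] = separator
	for i, d := range digits {
		pos := i
		if i >= separatorPos {
			pos++ // skip past the separator slot
		}
		buf[pos] = alphabet[d]
	}
	return string(buf)
}

func main() {
	digits := []byte{6, 11, 17, 10, 0, 0, 0, 0, 0, 0} // arbitrary digit values
	fmt.Println(encodePairs(digits, 10))              // 8HVG2222+22
}
```

The actual PR additionally hoists the two digit pairs after the separator out of the loop, so the loop body above would carry no `if` at all; the sketch keeps the branch to stay short.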

Performance difference on my machine:

Before:

will@Janeway:~/GolandProjects/open-location-code/go$ go test --run=NONE --bench=Encode --benchtime=10s -cpuprofile=/home/will/cpu2.out
goos: linux
goarch: amd64
pkg: github.com/google/open-location-code/go
cpu: AMD Ryzen Threadripper 3970X 32-Core Processor 
BenchmarkEncode-64    	163811120	        72.50 ns/op	      16 B/op	       1 allocs/op
PASS
ok  	github.com/google/open-location-code/go	22.076s

After:

will@Janeway:~/GolandProjects/open-location-code/go$ go test --run=NONE --bench=Encode --benchtime=10s -cpuprofile=/home/will/after.out
goos: linux
goarch: amd64
pkg: github.com/google/open-location-code/go
cpu: AMD Ryzen Threadripper 3970X 32-Core Processor 
BenchmarkEncode-64    	198057280	        56.89 ns/op	      16 B/op	       1 allocs/op
PASS
ok  	github.com/google/open-location-code/go	20.486s

I've attached cpu profiles of before/after as well.



google-cla bot commented Nov 8, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@willbeason (Author)

Okay, signed CLA.

@willbeason (Author) commented Nov 9, 2023

Playing around with the code, I found a second optimization that improves performance by an additional 10%. I'll include it in a separate follow-up PR, as it could potentially be controversial.
