Drop base64-js dependency, and optimise Buffer.write #349

Open
wants to merge 6 commits into
base: master

Conversation

@chjj (Contributor) commented Feb 4, 2024

Most string write functions (with the exception of hexWrite) were allocating arrays before copying them onto the buffer. This resulted in significant performance overhead.

This PR modifies said functions to write directly to the buffer. In addition, it drops the dependency on base64-js and adds support for the base64url encoding.
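As a sketch of the change (illustrative code with hypothetical function names, not the PR's actual implementation): the old path builds an intermediate byte array and then copies it, while the new path decodes each character straight into the target Uint8Array.

```javascript
// Old approach: allocate an intermediate byte array, then copy it.
function asciiWriteOld (buf, string, offset, length) {
  const bytes = []
  for (let i = 0; i < string.length; ++i) {
    bytes.push(string.charCodeAt(i) & 0xff) // intermediate allocation
  }
  let i = 0
  for (; i < length && i < bytes.length; ++i) {
    buf[offset + i] = bytes[i]
  }
  return i
}

// New approach: write each decoded byte directly into the buffer.
function asciiWriteNew (buf, string, offset, length) {
  const max = Math.min(length, string.length)
  for (let i = 0; i < max; ++i) {
    buf[offset + i] = string.charCodeAt(i) & 0xff // direct write
  }
  return max
}
```

Skipping the intermediate array saves one allocation and one copy per write, which is where most of the speedup comes from.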

The numbers are as follows:

$ node perf/write-base64.js
OldBuffer.byteLength(8192, "base64") x 24,009 ops/sec ±0.55% (93 runs sampled)
NewBuffer.byteLength(8192, "base64") x 9,548,372 ops/sec ±0.18% (95 runs sampled)

OldBuffer#write(8192, "base64") x 19,789 ops/sec ±0.49% (95 runs sampled)
NewBuffer#write(8192, "base64") x 22,646 ops/sec ±0.04% (95 runs sampled)

OldBuffer#toString(8192, "base64") x 6,686 ops/sec ±0.39% (87 runs sampled)
NewBuffer#toString(8192, "base64") x 10,735 ops/sec ±0.35% (90 runs sampled)

$ node perf/write-binary.js
OldBuffer#write(8192, "binary") x 13,662 ops/sec ±1.37% (84 runs sampled)
NewBuffer#write(8192, "binary") x 46,397 ops/sec ±0.64% (91 runs sampled)

$ node perf/write-ucs2.js
OldBuffer#write(8192, "ucs2") x 3,537 ops/sec ±0.70% (86 runs sampled)
NewBuffer#write(8192, "ucs2") x 40,422 ops/sec ±1.14% (93 runs sampled)

$ node perf/write-utf8.js
OldBuffer.byteLength(2898, "utf8") x 15,304 ops/sec ±0.87% (85 runs sampled)
NewBuffer.byteLength(2898, "utf8") x 58,197 ops/sec ±0.05% (86 runs sampled)

OldBuffer#write(2898, "utf8") x 13,565 ops/sec ±0.82% (85 runs sampled)
NewBuffer#write(2898, "utf8") x 32,716 ops/sec ±0.25% (93 runs sampled)

There are some massive speedups here, particularly with asciiWrite, ucs2Write, and utf8Write.

I was disappointed to see that base64 only gets a marginal speedup; however, there are significant speedups for base64Slice and byteLength('base64'). Likewise, byteLength('utf8') also received a significant speedup.

@chjj (Contributor, Author) commented Feb 4, 2024

Added a new commit to further optimize base64Write. This logic is basically how the Node.js C++ code works internally. It requires no string preprocessing (str.replace, str.split, str.indexOf, etc.), which gives it an extra perf boost.
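A preprocessing-free decoder can be sketched roughly like this (an illustrative sketch, not the PR's actual code): a single lookup table accepts both the base64 and base64url alphabets, and bytes are emitted directly into the buffer as 6-bit groups accumulate, with no intermediate string passes.

```javascript
// Lookup table covering both base64 and base64url characters.
const TABLE = new Int8Array(128).fill(-1)
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
  .split('').forEach((ch, i) => { TABLE[ch.charCodeAt(0)] = i })
TABLE[0x2b] = 62 // '+'
TABLE[0x2f] = 63 // '/'
TABLE[0x2d] = 62 // '-' (base64url)
TABLE[0x5f] = 63 // '_' (base64url)

function base64Write (buf, str, offset) {
  let bits = 0
  let acc = 0
  let j = offset
  for (let i = 0; i < str.length; ++i) {
    const c = str.charCodeAt(i)
    if (c === 0x3d) break // '=' padding ends the data
    const v = c < 128 ? TABLE[c] : -1
    if (v === -1) continue // skip whitespace/invalid characters
    acc = (acc << 6) | v
    bits += 6
    if (bits >= 8) {
      bits -= 8
      buf[j++] = (acc >> bits) & 0xff // emit a byte as soon as we have one
    }
  }
  return j - offset // number of bytes written
}
```

Because the table handles '-' and '_' directly, base64url input decodes in the same single pass as standard base64.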

New Numbers:

OldBuffer#write(8192, "base64") x 19,789 ops/sec ±0.49% (95 runs sampled)
NewBuffer#write(8192, "base64") x 24,542 ops/sec ±0.02% (97 runs sampled)

@dcousens (Collaborator) commented Feb 4, 2024

@chjj can you break this into two pull requests?
The addition of all the new benchmarks (easy merge), and then an isolated pull request for the actual behavioural changes?

@chjj (Contributor, Author) commented Feb 5, 2024

Done. Force pushed so this is now the isolated PR.

edit: I can rebase on master and force push again once you merge #352 if need be.

@dcousens changed the title from "Optimize string decoding (among other things)" to "Drop base64-js dependency, and optimise Buffer.write" on Feb 5, 2024
@dcousens dcousens self-requested a review February 5, 2024 04:22
@chjj (Contributor, Author) commented Feb 5, 2024

New base64 implementation using atob/btoa pushed.

Comment on lines +75 to +87
if (typeof global !== 'undefined' && global && global.Math === Math &&
    typeof global.Buffer === 'function' && global.Buffer.prototype &&
    typeof global.Buffer.prototype.asciiSlice === 'function') {
  const NodeBuffer = global.Buffer

  _atob = function atob (str) {
    return NodeBuffer.from(str, 'base64').toString('binary')
  }

  _btoa = function btoa (str) {
    return NodeBuffer.from(str, 'binary').toString('base64')
  }
}
@chjj (Contributor, Author) commented Feb 5, 2024

This shim leads to some untested code paths when testing in node.js (e.g. atob won't throw on a base64url string like it should).

Unfortunately, tape has no feature for setting a global or otherwise signalling to this piece of code that we are under test and that it should use the atob/btoa functions provided by Node.js instead of shimming them.

I see a few solutions:

  1. Create tape-wrapper.js which does something like: global.BUFFER_TEST = true; require('tape/bin/tape'); and add && !global.BUFFER_TEST to the already-gnarly if statement. Run tests with ./tape-wrapper.js instead of tape.
  2. Rewrite the tests in mocha which allows you to assign globals on boot.
  3. Only shim atob/btoa if typeof atob === 'undefined'. This fixes the tests and maintains functionality for pre-16.0.0 nodes, but creates a weird situation where older versions of node are actually faster than newer ones.
  4. Remove this entire thing. Always use atob/btoa. Tests will be fixed. Older nodes will break. Newer nodes will be slow. Benchmarks run inside node.js will be inaccurate.
  5. Beseech the node.js devs to fix atob/btoa which eventually does away with all the problems I just listed.
  6. Switch back to the custom base64 implementation which is almost as fast as atob in browsers and replicates the exact node.js behavior with no downgrade in perf on base64url etc.
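For reference, option 3 might look something like the following (a hypothetical sketch, not code from this PR; the `_atob` name is illustrative):

```javascript
// Option 3 sketch: only install the Buffer-backed shim when the
// environment lacks a native atob (i.e. Node.js < 16).
let _atob = typeof atob === 'function' ? atob : null

if (_atob === null && typeof global !== 'undefined' && global &&
    typeof global.Buffer === 'function') {
  const NodeBuffer = global.Buffer
  _atob = function atob (str) {
    return NodeBuffer.from(str, 'base64').toString('binary')
  }
}
```

On Node.js 16+ this picks up the native atob, so the tests exercise the real code path; only older Nodes fall back to the Buffer shim.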

@chjj (Contributor, Author) commented Feb 5, 2024

It just occurred to me that we could do something hacky, like checking:

&& global.process.argv[1].slice(-4) !== 'tape'

to exclude the test runner from the shims.

@chjj (Contributor, Author) commented Feb 5, 2024

After thinking about it, all of this atob/btoa inconsistency is really bugging me. I'm considering switching back to the custom base64 implementation. Although it is harder to review, it provides consistent behavior that we know will work everywhere, with predictable performance.

@chjj (Contributor, Author) commented Feb 7, 2024

@dcousens, I think I will switch this PR back to my custom base64 implementation.

The more I think about it, the more I don't like atob/btoa for the following reasons:

  1. It's hard to benchmark due to the node.js issue. The numbers may vary significantly in different browsers as well.
  2. It screws up the tests if we want to shim in atob/btoa for node.js.
  3. The node.js shim also causes issues because node.js has a history of breaking their base64 parser.
  4. Regarding point 3, the same could be true of atob in any JavaScript environment: we don't know whether that environment's base64 parser has some bug or inconsistency, and we don't know the performance properties of its base64 decoder either.
  5. With atob, buf.write('base64') is basically guaranteed to be slower on a lot of base64url strings or a string with invalid characters because it has to run base64clean and retry parsing.
  6. Using btoa is just straight up slower than our custom implementation in any environment (probably).
  7. There are environments you could bundle for that do not have atob/btoa. An example of this is quickjs. If I want to write some javascript code and compile it to a binary that runs everywhere, quickjs combined with cosmopolitan is the answer. By using atob/btoa, we break Buffer in that environment.
  8. It goes against the whole spirit of this PR which is to write bytes directly to the buffer during decoding.

Number 7 is one I recently thought of. I really really like the idea of compiling bundles for quickjs (I'm thinking of adding a quickjs target to my bundler to convert the quickjs api to the node.js api, but that's a different story).
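To illustrate point 5 above, an atob-based write path has to catch the failure on base64url input, normalise the string, and parse it a second time (a hypothetical sketch, not this library's code):

```javascript
// Sketch of the atob retry cost: native atob rejects base64url
// characters, so a fallback must re-clean and re-parse the string.
function decodeViaAtob (str) {
  try {
    return atob(str)
  } catch (e) {
    // extra pass over the string, plus a full second parse
    const cleaned = str.replace(/-/g, '+').replace(/_/g, '/')
    return atob(cleaned)
  }
}
```

For well-formed base64 this is a single fast call, but every base64url string pays for one failed parse, two replace passes, and one successful parse.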

The benefits of a custom implementation are as follows:

  1. Predictable performance.
  2. Predictable behavior.
  3. Faster toString.
  4. Supports any javascript environment, including things like quickjs and the d8 shell.
  5. Less fussing over all of the stuff above.

The downsides:

  1. Slower buf.write(str, 'base64') in environments like chrome (but again, chrome would only be significantly faster if the string is well-formed enough such that atob can parse it).
  2. Larger code size, harder to review.

@dcousens (Collaborator) commented Feb 7, 2024

I understand your plight for each of the points, but I don't know if this library should be a reference for this kind of functionality.
This library should be using whatever underlying native functions exist, as that is what users expect.

The primary reason this library exists is to support cross-compatibility of the native APIs, not to be an improved implementation of their shared functionality.

With atob, buf.write('base64') is basically guaranteed to be slower on a lot of base64url strings or a string with invalid characters because it has to run base64clean and retry parsing.

I don't understand this, we should know that the input string is base64url no?

This pull request is difficult to review quickly in the limited time I have, given that you have touched and moved utf8Write and hexWrite and ucs2Write and... can we reduce that diff?

@chjj (Contributor, Author) commented Feb 7, 2024

This library should be using whatever underlying native functions exist, as that is what users expect.

Fair enough.

I don't understand this, we should know that the input string is base64url no?

We can use that as a hint, but Buffer.from(str, 'base64') in Node has accepted the base64url character set for a long time. It's not guaranteed that a user will pass base64 rather than base64url.
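This is easy to demonstrate on a recent Node.js, where the 'base64' decoder accepts '-' and '_' as aliases for '+' and '/':

```javascript
// Node's 'base64' decoder has accepted base64url characters for a
// long time, so both of these parse to the same bytes:
const a = Buffer.from('/+8=', 'base64') // standard alphabet
const b = Buffer.from('_-8=', 'base64') // base64url alphabet
console.log(a.equals(b)) // true
```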

This pull request is difficult to quickly review in the limited time I have, when you have touched and moved the utf8Write and hexWrite and ucs2Write and... can we reduce that diff?

Yeah, I know I've been bugging you a lot lately. Sorry about that. Thank you for your attentiveness this past week.

I find it's best to look at the diff of each individual commit, but I see your point. I'll try to get it looking better.
