Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Adding support for brolti #173

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

nicksrandall
Copy link

@nicksrandall nicksrandall commented Sep 15, 2020

Fixes #71

TL;DR

There has been several attempts (#172, #158) to add brotli compression support to this library and so far none have been successfully merged (for various reasons). This PR is yet another attempt to support brotli compression in a backwards compatible way while also reducing existing code change as much as possible to ensure stability of the library.

What is the problem that is trying to be solved?

The brotli compression algorithm is generally more efficient than gzip and now supported in recent versions of Node.js and in all major browsers. Http clients (like web browsers) can specify an "Accept-Encoding" header in their requests to share which compression algorithms they support and prefer. According to the spec, when two values have the same preference, the first value will be used (seems reasonable right?).

However, many browsers (including Google Chrome) send the "br" (brotli) value last even though it is generally preferred to the other algorithms (ie they send gzip, deflate, br instead of br, gzip, deflate). This causes the brotli compression algorithm to be de-prioritized and unused.

What can we do about it?

We can follow a well-established convention to deviate from the spec slightly and force the server to choose the "preferred" compression algorithm when the client has (basically) stated that it doesn't not explicitly prefer one algorithm over another.

So, for example, if we get an "Accept-Encoding" header value of gzip, deflate, br we will use br (brotli) because brotli is more efficient over gzip and their preference (set by client) is both 1 (the default value).

However, if we get gzip;q=0.8, br;q=0.1 (q is level of preference from 0 to 1 where 1 is most preferred) we will use 'gzip' because the client has explicitly stated that it is more preferred than brotli.

@nicksrandall
Copy link
Author

@dougwilson this is ready for review... I think you'll like this PR much better

@nicksrandall
Copy link
Author

@dougwilson what do you think of the direction in this PR? Anything I can do to help you with the QA/Approval process?

@dougwilson
Copy link
Contributor

dougwilson commented Sep 17, 2020

Hi @nicksrandall apologies if I didn't mention it before: sorry I haven't been able to take a look just yet; I am currently spending all my allocated OS time right now addressing some security issues that have been reported. Please give me about a week to get everything settled and I will be taking a look, thank you 🙏 -- I've noticed you have made about 6+ pushes to this branch even after you noted it was ready for review; it seems like it is still having some edge cases sorted out anyhow 😂

@nicksrandall
Copy link
Author

Hi @dougwilson not trying to be pushy, just checking in to see where we are at with this PR? Any change you'll have some time to review anytime soon?

@nicksrandall
Copy link
Author

The only potential issue I can see with this change is that I'm cloning the request object on every request before I sent it to the accepts library to determine which encoding method should be used. As far as I know, the accepts library only uses the headers property on the request object so I could only send the cloned (and altered) headers to the accepts library with the rest of the request object being omitted. Therefore I would have less to clone so that might be a bit more performant but it could break something if accepts library decides to use other parts of the request in the future.

@dougwilson
Copy link
Contributor

The security issue is still on going at this time, I apologize. Github is the reporter and I have been working with their security group still on the situation.

Copy link
Contributor

@dougwilson dougwilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am on my phone, but tried to do review what I could to help move it along. It is not comprehensive and I didn't actually run anything, so please forgive me, but I thought that I could at least use my time here waiting in the dr. office to do something for you since you pinged me :)

index.js Outdated
method = accept.encoding(['gzip', 'identity'])
}
// compression method
var accept = accepts(objectAssign({}, res, { headers: headers }))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK the accepts API expects the request object, not a response object clone.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, this is a bug (although it didn't actually effect anything because accepts library only reads the headers). This has been fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is even cloning the res correctly, anyhow; it seems to only copy over the own, enumerable properties; all the methods from the prototype and such are lost, for instance.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this code so that we don't pass in the entire (cloned) request and instead with just pass in the cloned headers. Thats all the accepts function reads from anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that of course does end up violating the API contract of that dependency. We should either hard-pin that dependency so that we have to manually bump the version for any changes since this would now be relying on the internal behavior of that dependency to not access anything in outside of that one property, or perhaps get the dependency to change/clarify the contract to only being the headers. If we do go the pin dependency route, we'll want to check that if it then passes the req object to a dependency of it's own, that that is pinned within that module as well. This will prevent the possibility of sudden breakage when dependencies, dependencies of dependencies, etc. update

*
* @private
*/
function prioritize (str) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this, it seems like this can be an option added to the accept negotiation logic directly? Like an option to prefer server order over client order? I think that is effectively what this is doing, or am I misunderstanding?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea. I have added a priorizeClient option that will allow users to opt-out of this functionality.

For what it's worth, we were already using this behavior because we were prioritizing GZIP over deflate even when deflate came first. As such, I think "opt-out" vs "opt-in" is acceptable in this case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, I think I didn't explain very well my thought on this; it wasn't to add a parameter here to this library's API, but rather to the accepts API. As far as adding this as a parameter to this module's API, I'm not sure if it's actually necessary, though if it is going to be surfaced as a switch, I would kind of expect such a switch to default to the spec compliant method. But again, I'm not sure if this actually needs to be a switch for the user; a better user switch would actually be to turn on/off each individual compression method, but that would be out of scope for this PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you are saying but we were already out of spec because we were prioritizing gzip over deflate even when deflate came first. The option to be fully spec compliant would be a new feature for this library.

I also will argue that most people do not want the spec compliant behavior and they will prefer the default option that we have here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are miss understanding what I am saying on this comment again, I'm really sorry. I must not be good at communicating my idea here through writing and that is my bad. I tried twice and it wasn't successful :( Perhaps if you desiure to move this forward would you be willing to do some kind of voice call?

index.js Outdated
* @private
*/
function sortEncodings (a, b) {
if (a.indexOf('br') >= 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The encoding are case insensitive according to the spec, but this us treating them as case sensitive.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another great catch, I now lowercase the value of the header before I look for any specific values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to now violate the spec in a different way: the q parameter name is case-sensitive, so br;q=0 header would have br with a quality of 0, but br;Q=0 would have a quality of 1. This change you made will now accidentally treat the latter as the former.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you are saying. I've updated the code so that this sorting function will sort the items in the header but it will not change their original case. It will also work with any casing of encoding values.

Copy link
Contributor

@dougwilson dougwilson Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why it is using .indexOf instead of just a comparison? I would think .indexOf would identify bro as br or similar. I'm not sure what that would all entail. I glanced again at the spec and this header doesn't allow random parameters to be specified, so it does seem unlikely to get mixed up in that way, but when looking, I realize that this header does accept the special * value. I know this sorting does not take that * into account there, so I wonder if that will cause issues with the logic? I tried a few different headers to see what the outcomes are and was curious if these make sense:

* : Results in br encoding
*, gzip: Results in gzip encoding
gzip, *: Results in gzip encoding

@Abhijith-net
Copy link

@nicksrandall Thanks for this change, is there a plan for closing this. We are planning to go for production.

index.js Show resolved Hide resolved
Co-authored-by: Matthias Le Brun <mlbli@me.com>
@Kisama
Copy link

Kisama commented Feb 26, 2021

Is there any plan for close this PR?

I'm waiting for this feature to release

@nicksrandall
Copy link
Author

I think that #172 is more likely to land. I'll close this PR as soon as that PR lands.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for Brotli compression
7 participants