Fix usage of undocumented _implicitHeader #128

Open · maritz wants to merge 11 commits into master from bugfix/use_writeHead_instead_of_implicitHeader

Conversation

@maritz commented Jan 10, 2018

As discussed in #127, this PR now only includes the changes that should not break node v0.8, and it has tests.

Sadly supertest does not support http2 yet, so I had to resort to vanilla server/client creation.

I'm not destroying the client/server here, not sure if that's needed?!

@dougwilson added the pr label on Jan 15, 2018
@dougwilson self-assigned this on Jan 15, 2018
@dougwilson (Contributor) commented:

Yeah, you can go ahead and update the Travis CI file. We just have one entry for the latest minor of each major. I'm not sure if any of the 8.x ones have the http2 module, but at least adding the most recent 9.x would work. Make sure to validate that all functionality works with 9.x, including that no deprecation messages are printed, for this and all our dependencies before adding it to Travis CI 👍 that is our standard Travis CI process.

@maritz (Author) commented Jan 15, 2018

There is one warning node currently gives when using http2:
(node:4065) ExperimentalWarning: The http2 module is an experimental API.

As far as I can tell neither v8.8 nor v9.4 show any deprecation warnings.

test/compression.js (review thread resolved)
@dougwilson (Contributor) commented:

Thanks for checking, @maritz. Can you verify that you ran every test suite of every dependency of this module throughout the tree with Node.js 9.x as well to check this? This module doesn't fully exercise all the features of our own dependencies, because the assumption is that they already have test suites, so we don't want to copy-and-paste every test suite up into each module. Many times the issue is only apparent when doing this, so we have to do it in order to bump any version in the Travis CI configuration 👍

test/compression.js (review thread outdated, resolved)
@maritz (Author) commented Jan 15, 2018

Can you verify that you ran every test suite of every dependency of this module throughout the tree with Node.js 9.x as well to check this?

Do you mean going through all 12 dependencies, checking out their git repos, and manually running their tests? If so, no. I don't have the time to do that right now and, quite frankly, don't think that's a reasonable thing to do.

Or do you have some tool to do this? Or did I misunderstand?

@dougwilson (Contributor) commented:

Yes, that is the correct understanding. It's no problem at all, I can do it 👍 I just didn't want to if you had already done it. I just use some shell scripts that use jq and git to clone out the repos, and then find + npm to run the tests through them in a single pass. We have been burned multiple times by not doing this, so it's a requirement in order to add new versions. We basically use the versions in Travis CI as the list of "verified compatible" versions. We've only ever been burned on major upgrades, so this would mainly be for adding the 9.x series.
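For illustration only, here is a rough sketch of that kind of one-pass dependency check as a Node.js script. It is an approximation of the workflow described above, not dougwilson's actual shell scripts; the deps-check directory name and the use of npm view to resolve repository URLs are assumptions.

// Clone each direct dependency of this package and run its test suite under
// the current Node.js version, one after another.
var childProcess = require('child_process')
var fs = require('fs')
var pkg = require('./package.json')

if (!fs.existsSync('deps-check')) fs.mkdirSync('deps-check')

Object.keys(pkg.dependencies).forEach(function (name) {
  // resolve the dependency's repository URL from the npm registry
  var repo = childProcess
    .execSync('npm view ' + name + ' repository.url', { encoding: 'utf8' })
    .trim()
    .replace(/^git\+/, '')
  var dir = 'deps-check/' + name
  // shallow-clone the repo and run its tests with the active node version
  childProcess.execSync('git clone --depth 1 ' + repo + ' "' + dir + '"', { stdio: 'inherit' })
  childProcess.execSync('npm install && npm test', { cwd: dir, stdio: 'inherit' })
})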

.travis.yml (review thread outdated, resolved)
@dougwilson (Contributor) commented:

Ok, I ran the tests of everything against Node.js 9.4 and updated the Travis CI config in master 👍

@maritz (Author) commented Jan 15, 2018

Hm, not sure why the tests are suddenly failing on Travis. They work fine locally with 9.4 and worked fine on Travis before I merged master to resolve the conflicts.

@dougwilson (Contributor) commented:

Yeah, I agree, it looks like nothing changed. Perhaps the test itself just has a race condition and it happened to fail this time? If rerunning the test makes it suddenly pass, it's probably a flaky test that would need to be debugged.

@maritz (Author) commented Jan 15, 2018

Okay, now the tests passed. I'm hoping that these changes actually fixed it. Can't guarantee it though, since I don't truly understand what caused it in the first place.

Anything else I could do?

@maritz (Author) commented Jan 17, 2018

@dougwilson Could you manually trigger the Travis checks for this PR a couple of times to see if it still fails occasionally? Not a great thing to have to do, but better safe than sorry, right?

@dougwilson (Contributor) commented:

Yep, I can do that 👍 Sorry for the delay, we're just trying to get a couple of security patches out ASAP. January seems to be the season of receiving security reports or something 😂

@maritz (Author) commented Mar 8, 2018

There were changes in node v9.8.0 that might fix this... That's about all I've got on this right now. :-(

@maritz (Author) commented May 6, 2018

I'm actually seeing this occasionally in the one app where I use this PR. I have no indication yet of what causes it, and it always takes the entire process down.

If anyone has any ideas on how to debug this, I'm very interested in it.

@dougwilson force-pushed the master branch 3 times, most recently from d7bb81b to cd957aa on May 30, 2018
@maritz force-pushed the bugfix/use_writeHead_instead_of_implicitHeader branch from e4331cc to 3b4a73e on September 26, 2018
@maritz (Author) commented Sep 29, 2018

Just to clarify: the code changes in this PR should be completely fine; they just change _implicitHeader to writeHead.

There is just some problem with how compression works over http2. If you want, I can make yet another PR with just this code change. I don't know if _implicitHeader on http1 is going away anytime soon, so there is probably no hurry.
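For reference, the substance of the change is small. Schematically (a sketch of the idea, not the exact diff; the headersSent guard here is illustrative), the middleware stops flushing headers through the undocumented internal and uses the documented call instead:

// before: relies on the undocumented, http1-only internal
if (!res.headersSent) {
  res._implicitHeader()
}

// after: uses the documented API, which the http2 compat layer also implements
if (!res.headersSent) {
  res.writeHead(res.statusCode)
}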

@DullReferenceException commented:

@maritz, @dougwilson: need any help with this PR? I'd definitely prefer using this over rolling our own compression support for HTTP2.

@maritz (Author) commented Mar 28, 2019

If you can figure out what is causing the seemingly random failures, that would be great.

I don't have the time and, honestly, probably lack the understanding of how compression and http2 work to fix this properly.

@DullReferenceException commented:

@maritz: do you have any hints on what your app does that might be provoking those failures? So far, I've been unable to reproduce the failure. It may be that the version of node 10 I'm on is different from what you were using. I see that some fixes (such as nodejs/node#23146) have been made to HTTP2 since you saw this happening.

@maritz (Author) commented Mar 29, 2019

do you have any hints on what your app does that might be provoking those failures?

Not really; it was just failing after some amount of time of working fine. It's a next.js server that also serves a few static files, so nothing too fancy.

But Travis CI also had failures sometimes, so I really don't know how to approach this.

@DullReferenceException commented:

I'll be doing a bunch of testing of your published fork within our app soon; maybe that will be sufficient validation. Will report back.

@sogaani commented Sep 4, 2019

With the current commit, I could not reproduce the error @maritz reported, but I figured out two possible reasons that could have caused it.

  • First possible reason (the error seen on Travis CI)
    Until this commit, the http2 test closed the http2 stream as soon as the client received the header frame ('response' is emitted when the header frame reaches the client):

    request.on('response', function (headers) {
      assert.equal(headers['content-encoding'], 'gzip')
      closeHttp2(request, client, server, done)
    })

    Since Zlib compression is asynchronous, the client can receive the header frame and close the stream before the body has been sent. I surmise this is the issue @maritz occasionally saw, and that it has since been solved.
    The following code reproduces the issue 100% of the time:

    var server = createHttp2Server({ threshold: 0 }, function (req, res) {
      res.setHeader('Content-Type', 'text/plain')
      res.writeHead(res.statusCode)
      setImmediate(() => { res.end() })
    })

    server.on('listening', function () {
      var client = createHttp2Client(server.address().port)
      var request = client.request({
        'Accept-Encoding': 'gzip'
      })
      request.on('response', function (headers) {
        closeHttp2(request, client, server, () => {})
      })
      request.end()
    })
  • Second possible reason (the issue seen with next.js)
    When the server returns 304 (Not Modified) or 204 (No Content), the error can happen with compression.
    When the server returns status code 304 or 204, the node http2 module starts to close the stream. After that, calling res.end() should seemingly be safe. However, with compression, res is wrapped with a Zlib stream, and the Zlib stream emits data even for empty input (which is not a bug). So when the server calls res.end() on the Zlib-wrapped res, it eventually calls res.write(chunk), which can cause the error.

    With http1, if the status code is already set to 304 or 204, http_outgoing_message.write(chunk) checks whether the response may have a body and does nothing (https://github.com/nodejs/node/blob/master/lib/_http_outgoing.js#L619-L624).
    The http2 compat layer just throws an error instead.

    next.js serves static files with the send package. The send package checks whether the file is cached, returns 304 (Not Modified), and calls res.end().

    The following code reproduces the issue 100% of the time:

    var server = createHttp2Server({ threshold: 0 }, function (req, res) {
      res.setHeader('Content-Type', 'text/plain')
      res.writeHead(304)
      setImmediate(() => { res.end() })
    })

    server.on('listening', function () {
      var client = createHttp2Client(server.address().port)
      var request = client.request({
        'Accept-Encoding': 'gzip'
      })
      request.end()
    })

To fix the second issue, I think there are two approaches:
A. check the status code in this package (a sketch follows below);
B. check the status code in the Node.js http2 compat layer.
I prefer approach B, because it keeps compatibility with http1.
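For illustration, a minimal sketch of what approach A could look like inside this middleware. The helper name and the exact hook point are assumptions, not an actual patch; nocompress stands in for the middleware's existing "do not compress" path.

// Approach A sketch: treat 204/304 the way http1's _http_outgoing does and
// skip compression entirely, so res.end() never reaches zlib and never
// triggers a write on an http2 stream that is already closing.
function isBodilessStatus (statusCode) {
  return statusCode === 204 || statusCode === 304
}

// hypothetical use inside the middleware, right before wrapping res in a
// gzip/deflate stream:
if (isBodilessStatus(res.statusCode)) {
  // fall back to the original, unwrapped res.end() / res.write()
  return nocompress('bodiless status code')
}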

Any thoughts?

@wood1986 commented:

Do you have any ETA for getting this merged?

@maritz (Author) commented Jan 19, 2020

I have re-enabled compression on my server with a current node version (v12.14.1) and have performed some long-running load tests against a locally running next.js server serving HTML (200 status code), with more than a million compressed requests. I have also done a lot of manual loading of 200 and 304 requests with gzip compression on.

So far no crashes. I don't know if something changed in node or whether I've just been unlucky in reproducing it this time. I vote for merging and then dealing with potential issues if they arise.

Since all failures I've personally seen were with http2 anyway, not http1, this would still be an improvement over the status quo of essentially no one using compression over http2 in the express-based world.

Otherwise, if anyone is interested in testing it without an official merge, you can use '@maritz/compression' from npm and report back how it works out for you.

@dougwilson (Contributor) left a review:

Thanks for requesting my review. At the time, I was hoping that @sogaani's comments would be answered prior to merging and then... promptly forgot to follow up here 😆 I take it from your response that there is nothing to do from his comments, so I just reviewed the tests that were added. I also wanted to note that I have a "rebase" label on this issue, as merge conflicts are currently preventing the merge. If you have any follow-up questions/comments, I will be back around tomorrow to follow up 👍

var _compression = compression(opts)
var server = http2.createServer(function (req, res) {
  req.on('error', function (error) {
    console.error('An error event occurred on a http2 request.', error)

@dougwilson (Contributor):
No one is going to comb through the CI logs after every run or upgrade... do these need to be here vs just letting an error crash the process? If they are preventing an expected error, it shouldn't be logging to the screen, as there should be no extra output when everything was actually OK.

    console.error('An error event occurred on a http2 request.', error)
  })
  res.on('error', function (error) {
    console.error('An error event occurred on a http2 response.', error)

@dougwilson (Contributor):
No one is going to comb through the CI logs after every run or upgrade... do these need to be here vs just letting an error crash the process? If they are preventing an expected error, it shouldn't be logging to the screen, as there should be no extra output when everything was actually OK.

  })
})
server.on('error', function (error) {
  console.error('An error event occurred on the http2 server.', error)

@dougwilson (Contributor):
No one is going to comb through the CI logs after every run or upgrade... do these need to be here vs just letting an error crash the process? If they are preventing an expected error, it shouldn't be logging to the screen, as there should be no extra output when everything was actually OK.

  console.error('An error event occurred on the http2 server.', error)
})
server.on('sessionError', function (error) {
  console.error('A sessionError event occurred on the http2 server.', error)

@dougwilson (Contributor):
No one is going to comb through the CI logs after every run or upgrade... do these need to be here? If they are preventing an expected error, it shouldn't be logging to the screen, as there should be no extra output when everything was actually OK.

function createHttp2Client (port) {
  var client = http2.connect('http://127.0.0.1:' + port)
  client.on('error', function (error) {
    console.error('An error event occurred in the http2 client stream.', error)

@dougwilson (Contributor):
No one is going to comb through the CI logs after every run or upgrade... do these need to be here vs just letting an error crash the process? If they are preventing an expected error, it shouldn't be logging to the screen, as there should be no extra output when everything was actually OK.

// force existing connections to time out after 1ms.
// this is done to force the server to close in some cases where it wouldn't do it otherwise.
server.setTimeout(1, function () {
  console.warn('An http2 connection timed out instead of being closed properly.')

@dougwilson (Contributor):
So the comment says "force existing connections to time out after 1ms.", but inside the timeout there is nothing but a console.warn. Should there be something being done in here? Otherwise, what purpose does the timeout serve? If it is meant to fail the test, should an error be passed to the callback instead to make the test fail?
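One possible shape for what this comment suggests, assuming the timeout handler can reach the test's done callback (a fragment for illustration, not the actual test code):

server.setTimeout(1, function () {
  // fail the test explicitly instead of only logging a warning
  done(new Error('http2 connection timed out instead of being closed properly'))
})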

  assert.equal(headers['content-encoding'], 'gzip')
})
request.on('error', function (error) {
  console.error('An error event occurred on a http2 client request.', error)

@dougwilson (Contributor):
No one is going to comb through the CI logs after every run or upgrade... do these need to be here vs just letting an error crash the process? If they are preventing an expected error, it shouldn't be logging to the screen, as there should be no extra output when everything was actually OK.

  console.error('An error event occurred on a http2 client request.', error)
})
request.on('data', function (chunk) {
  // no-op without which the request will stay open and cause a test timeout

@dougwilson (Contributor):
Shouldn't this assert that the data was actually correct?

@maritz (Author) commented Jan 21, 2020

At the time, I was hoping that @sogaani's comments would be answered prior to merging

Yeah, I'll try to see what I can do about that. I obviously can't really make the decision about whether to check for 304 in the module or in the node compat layer. But I can do some more tests with old node versions etc. to see if the 100%-reproducible cases can reasonably be added to these tests.

wanted to note that I have a "rebase" label on this issue, as merge conflicts are currently preventing the merge

My bad, I saw that at some point and then forgot about it later on. :-D

Regarding the test reviews: yep, those logs and no-ops are suboptimal. I think I added the logs simply in case we do see the unexpected failures again, so they could be caught this way. But it's been a long time; I'll go over all the tests and make sure they are done properly.

@maritz (Author) commented Feb 12, 2020

If someone else wants to take over, they are of course free to do so. I can't promise any time frame for when I'll get around to this, sorry.

@jonchurch (Member) commented:

@maritz You can check a box (bottom right of the PR, I believe) to allow maintainers to push to this PR, if you'd like to leave the work open to updates.

@maritz (Author) commented Feb 14, 2020

@jonchurch Thanks; it seems it's already checked by default. 👍

@Icehunter commented:

@maritz @jonchurch I can't rebase this PR; however, I did fork @maritz's fork and then rebased it onto express master (master...Icehunter:bugfix/use_writeHead_instead_of_implicitHeader). How can I help with this?

@jonchurch (Member) commented:

@Icehunter You can either submit a PR to the OP's branch (https://github.com/maritz/compression/tree/bugfix/use_writeHead_instead_of_implicitHeader) or create a new PR here.

@lukasoppermann commented Aug 5, 2020

Hey, I am running into this issue as well.

It seems @Icehunter has already addressed all of this apart from one edge case? Is there a way to help here and make this merge happen? We are otherwise at a loss if we don't use express. 😢

@lamweili commented Aug 2, 2022

Superseded by #170.
