Compress the source when posting to build server #85

Open
foolip opened this issue Jan 29, 2016 · 14 comments

@foolip
Member

foolip commented Jan 29, 2016

Just uploading the source accounts for the majority of the build time when you're on a slow network.

https://tools.ietf.org/html/rfc2388#section-5.1 says "do it yourself", so I see two options:

  1. Add a sourcegz field or something that takes the compressed source instead.
  2. Make the build server use HTTPS, and verify that the negotiated connection uses compression.

I think HTTPS makes sense, for other obvious reasons as well.
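A minimal sketch of option 1, assuming the client gzips the source itself and posts it under a differently named multipart field. The `sourcegz` name comes from the suggestion above; the helper names, the sample payload, and the idea that the server would accept such a field are assumptions for illustration only.

```python
from __future__ import annotations

import gzip
import uuid


def gzip_field(name: str, payload: bytes) -> tuple[str, bytes]:
    """Return (field name + "gz", gzip-compressed payload)."""
    return name + "gz", gzip.compress(payload)


def encode_multipart(fields: dict[str, bytes]) -> tuple[str, bytes]:
    """Encode binary fields as a multipart/form-data body.

    Returns the Content-Type header value and the encoded body.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, data in fields.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("ascii")
        parts.append(header + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("ascii"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


# Stand-in for the real source file (hypothetical content).
source = b"<!DOCTYPE html>\n" + b"<p>Example paragraph of spec text.</p>\n" * 2000
name, compressed = gzip_field("source", source)
content_type, body = encode_multipart({name: compressed})
```

The server would then need to gunzip the `sourcegz` field before handing it to the build, which is the part this sketch leaves out.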

@foolip
Member Author

foolip commented Jan 29, 2016

@domenic?

@domenic domenic self-assigned this Jan 29, 2016
@domenic
Member

domenic commented Jan 29, 2016

Yeah, I've been meaning to add HTTPS for a while, especially since AWS apparently makes that easy now.

@foolip
Member Author

foolip commented Jan 30, 2016

Another thing one could do is to assume that the build server has a copy of the html repo, and only send a diff against the merge-base with whatwg/html. A bit elaborate, of course, and the next bottleneck would be sending the output back.

@zcorpan
Member

zcorpan commented Jan 30, 2016

Could we use dropbox for this, or some similar service? https://www.dropbox.com/help/8

@foolip
Member Author

foolip commented Feb 3, 2016

Unfortunately, as tested with Wireshark in whatwg/html-build#64 (comment) there is no automatic compression at the TLS layer.

@foolip foolip reopened this Feb 3, 2016
@foolip
Member Author

foolip commented Feb 3, 2016

So, just to break this down and see where the low hanging fruit is, here's what we send:

  • source-whatwg-complete is 6055 kB and gzips to 1158 kB
  • caniuse.json is 980 kB and gzips to 135 kB
  • w3cbugs.csv is 245 kB and gzips to 96 kB

The returned wattsi-output.zip is 4218 kB and is already compressed using Defl:N (so says unzip -lv), which is the same algorithm that gzip uses.

Potential compression in total is from ~7.2 MB to ~1.4 MB, and we only need to fiddle with the posted data.
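As a quick check of the arithmetic, the per-file figures listed above can be summed directly. This sketch uses only the sizes quoted in the comment; the dictionary layout is just an illustration.

```python
# Sizes in kB as quoted above: (uncompressed, gzipped).
sizes_kb = {
    "source-whatwg-complete": (6055, 1158),
    "caniuse.json": (980, 135),
    "w3cbugs.csv": (245, 96),
}

raw_total = sum(raw for raw, _ in sizes_kb.values())
gz_total = sum(gz for _, gz in sizes_kb.values())
saving = 1 - gz_total / raw_total

print(f"{raw_total} kB -> {gz_total} kB ({saving:.0%} smaller)")
# prints "7280 kB -> 1389 kB (81% smaller)"
```

The totals, 7280 kB and 1389 kB, are in line with the rounded figures quoted above.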

@foolip
Member Author

foolip commented Feb 3, 2016

As for the imagined automatic compression in TLS, a colleague has educated me: while it's part of the protocol, it's been turned off in browsers because of security issues ("adaptive chosen plaintext attacks"). Even if we can get it to work with curl, it's probably not a safe long-term bet.

So, @domenic, do you think you could add support for a .gz variant of each field? Unless someone knows of a way to get the whole request body compressed at the HTTP level, since TLS is the wrong level. Maybe HTTP2 can do it?
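For reference, HTTP itself lets a client mark a request body as gzip-compressed with a `Content-Encoding: gzip` header, but the server has to be set up to decode it, and many are not by default. A minimal sketch, assuming the build server were configured that way; the URL and function name are hypothetical, and the request is only constructed, never sent.

```python
import gzip
import urllib.request

# Hypothetical endpoint, used only for illustration.
BUILD_URL = "https://build-server.example/wattsi"


def gzipped_post(url: str, body: bytes, content_type: str) -> urllib.request.Request:
    """Build a POST request whose entire body is gzip-compressed."""
    return urllib.request.Request(
        url,
        data=gzip.compress(body),
        headers={
            "Content-Type": content_type,  # type of the *uncompressed* body
            "Content-Encoding": "gzip",    # tells the server to gunzip first
        },
        method="POST",
    )


# Stand-in payload (hypothetical content).
body = b"source=" + b"spec text " * 10000
req = gzipped_post(BUILD_URL, body, "application/x-www-form-urlencoded")
```

Unlike a per-field `.gz` variant, this compresses the multipart framing as well, at the cost of depending on server-side support for encoded request bodies.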

@haavardmolland

The TLS spec does support compression, but it is considered bad practice because it opens the door to attacks. Compression attacks belong to the "adaptive chosen plaintext" category, though, which means the attacker needs to somehow control part of what the agent sends. Such attacks are hard to mount, and it is mostly browsers that are vulnerable. However, since it's considered harmful I wouldn't use it, as it is or will most likely be turned off in whatever TLS stack you are using.

@foolip
Member Author

foolip commented Feb 3, 2016

@haavardmolland informs me that HTTP2 compresses (almost) everything, so that would be an option. curl has a --http2 option that presumably works at least some of the time. @domenic, does your build server know how to speak HTTP2?

@domenic
Member

domenic commented Feb 3, 2016

I would love to make my build server http2 aware. I will try to figure that out over the next day or two.

@foolip
Member Author

foolip commented Feb 3, 2016

I've noticed that even though my curl binary has a --http2 option, it actually fails with curl: (1) Unsupported protocol when I try to use it. The curl binary from Homebrew has the same problem.

This would limit compression to very bleeding-edge curl installs, if it's ever enabled by default. Not sure how to deal with this; it seems like compressing ourselves would be the only way to benefit most users of html-build :(

@domenic
Member

domenic commented Feb 3, 2016

Are you sure it's not just failing because my server doesn't support that protocol?

@foolip
Member Author

foolip commented Feb 3, 2016

I'm pretty sure, yes, because it fails fast and running with verbose output it doesn't seem to even try connecting. curl -V also doesn't list it as a supported protocol:

curl 7.43.0 (x86_64-apple-darwin15.0) libcurl/7.43.0 SecureTransport zlib/1.2.5
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp 
Features: AsynchDNS IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz UnixSockets
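A check like the one above can be scripted. This is a rough sketch that assumes curl builds with HTTP/2 support advertise `HTTP2` on the `Features:` line of `curl -V`; the function name is made up for illustration, and the sample text is the output quoted above.

```python
def curl_supports_http2(version_output: str) -> bool:
    """Return True if `curl -V` output lists HTTP2 on its Features line."""
    for line in version_output.splitlines():
        if line.startswith("Features:"):
            return "HTTP2" in line.split(":", 1)[1].split()
    return False


# The `curl -V` output quoted in the comment above.
sample = (
    "curl 7.43.0 (x86_64-apple-darwin15.0) libcurl/7.43.0 SecureTransport zlib/1.2.5\n"
    "Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps "
    "pop3 pop3s rtsp smb smbs smtp smtps telnet tftp\n"
    "Features: AsynchDNS IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB "
    "SSL libz UnixSockets\n"
)

print(curl_supports_http2(sample))  # prints "False"
```

In a real script the text would come from something like `subprocess.run(["curl", "-V"], capture_output=True, text=True).stdout`, so the build could fall back to plain HTTP/1.1 when the feature is missing.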

Some relevant web search finds:
https://curl.haxx.se/docs/http2.html
Homebrew/legacy-homebrew#36942

@domenic
Member

domenic commented Feb 4, 2016

OK. That is very sad. But I can work on adding support for a zip body instead of a multipart body then; simple enough. I will probably switch on Content-Type first (application/zip => zip path) and then remove support for non-zip after a few weeks. Alternatively, I could add a new endpoint (/v2/wattsi or /wattsi-zipped or similar), but it's probably not worth worrying about at this point.
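The Content-Type switch described above could look roughly like the following. This is a sketch under stated assumptions, not the build server's actual code: the function names and the zip entry names are hypothetical, and the existing multipart path is left as a stub.

```python
from __future__ import annotations

import io
import zipfile


def build_zip_body(files: dict[str, bytes]) -> bytes:
    """Client side: pack the input files into one DEFLATE-compressed zip."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def read_inputs(content_type: str, body: bytes) -> dict[str, bytes]:
    """Server side: choose a parsing path based on the request's Content-Type."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/zip":
        with zipfile.ZipFile(io.BytesIO(body)) as zf:
            return {name: zf.read(name) for name in zf.namelist()}
    if media_type == "multipart/form-data":
        # The existing multipart handling would go here; omitted from this sketch.
        raise NotImplementedError("legacy multipart path omitted from this sketch")
    raise ValueError(f"unsupported Content-Type: {content_type!r}")


# Hypothetical entry names and stand-in contents.
files = {
    "source": b"<p>spec text</p>\n" * 1000,
    "caniuse.json": b"{}",
    "w3cbugs.csv": b"id,summary\n",
}
body = build_zip_body(files)
inputs = read_inputs("application/zip", body)
```

Keeping both branches behind one endpoint lets old clients keep posting multipart for a while; the legacy branch can be deleted once they have migrated.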

I do wish we had whatwg/html-build#53 in place though.

@domenic domenic transferred this issue from whatwg/html-build Aug 31, 2023