gz failing to process #126

Open
M-Gonzalo opened this issue Jan 7, 2021 · 4 comments · May be fixed by #137

Comments

M-Gonzalo commented Jan 7, 2021

(0.00%) Possible zLib-Stream in GZIP found at position 0
Compressed size: 6406001
Can be decompressed to 18424752 bytes
No matches
New size: 17661029 instead of 17661001     

Done.
Time: 5 second(s), 893 millisecond(s)

Recompressed streams: 0/1
GZip streams: 0/1

cloudflared.tar.gz

@schnaader schnaader self-assigned this Jan 8, 2021
@schnaader schnaader added this to the Precomp v0.4.8 milestone Jan 8, 2021
schnaader (Owner) commented

Very interesting. This seems to be a gzip- and preflate-related bug. Precomp 0.4.6 (the last version not using preflate) showed the correct compressed and decompressed sizes (17,660,983 -> 37,623,808), but couldn't proceed further (as expected). These sizes are correct and can be confirmed by looking at the file with 7-Zip.

Precomp 0.4.7 and 0.4.8dev (using preflate) show incorrect sizes (6,406,001 -> 18,424,752), as you posted, so they seem to detect only part of the data. As the gzip header doesn't contain size information, this is most likely a bug in preflate's deflate parsing.

As a workaround, "-brute" manages to process some other parts and gives a slightly better result ("-cn -brute" -> 31,158,148 bytes, "-cl -brute" -> 16,748,698 bytes).

I will investigate this further and depending on the outcome (if it really is a bug in preflate), I might create an additional issue in the preflate repository.
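
Editor's note: the size check described above (confirming the compressed and decompressed sizes independently of Precomp) can be sketched in Python. This is a minimal sketch, not part of Precomp or preflate. The function name `first_member_sizes` is hypothetical, and it assumes the file fits in memory. While the gzip header carries no size field, the 8-byte trailer ends with ISIZE, the uncompressed size modulo 2^32, which gives a second value to compare against.

```python
import gzip
import struct
import zlib


def first_member_sizes(data: bytes) -> tuple[int, int, int]:
    """Return (member_len, decompressed_len, isize) for the first gzip member.

    member_len counts the gzip header, the deflate stream and the trailer.
    """
    # wbits=31 tells zlib to expect a gzip header and trailer.
    d = zlib.decompressobj(wbits=31)
    out_len = len(d.decompress(data)) + len(d.flush())
    if not d.eof:
        raise ValueError("input ended before the end of the gzip member")
    # Any bytes after the member's trailer are left in unused_data.
    member_len = len(data) - len(d.unused_data)
    # ISIZE is the last 4 bytes of the member, little-endian.
    (isize,) = struct.unpack("<I", data[member_len - 4:member_len])
    return member_len, out_len, isize


if __name__ == "__main__":
    sample = gzip.compress(b"precomp" * 1000)
    print(first_member_sizes(sample))
```

If a tool reports a decompressed size well below `out_len` for the same file, it stopped parsing the deflate stream early, which matches the symptom described in this thread.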

as-com commented Apr 25, 2022

I have another valid gzip file that precomp cannot process. It appears to be produced by Go's gzip implementation.

Here's the gzip file: https://drive.google.com/file/d/1VAK8e9aYpTREjYIvaQ9dDu8oLdcaNmoK/view?usp=sharing

And here's the raw deflate stream: https://drive.google.com/file/d/1RXQdupGnuDpfbPEw5t_MGa4AkwE69v8P/view?usp=sharing

I attempted to get preflate itself to process the (valid!) deflate stream, but it simply says "failed." I believe there is an issue with preflate.

as-com commented Apr 25, 2022

I've narrowed down the issue to preflate's home-grown Huffman coding implementation not liking at least one of the Huffman trees that are included in the deflate stream.

The problem exists in this function: https://github.com/deus-libri/preflate/blob/609eefaa96ac6c51d7b1a3fb29e0ed94d0f3623e/support/huffman_helper.cpp#L20

It runs all the way to the bottom of the function, where the return statement evaluates to false. The expression seems suspect to me, but I don't have time right now to compare it with the spec/debug it further.
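
Editor's note: for readers unfamiliar with this kind of check, a deflate stream describes each Huffman tree as a list of code lengths, and a decoder typically validates that list with the Kraft sum (the sum of 2^-length over all used symbols). The sketch below is a generic illustration only. It is not preflate's code and not the fix from #137, and the name `kraft_status` is hypothetical. Whether an incomplete set should be rejected is a policy decision for the decoder, so the sketch only classifies the set.

```python
def kraft_status(code_lengths, max_bits=15):
    """Classify a set of Huffman code lengths by its Kraft sum.

    A length of 0 means the symbol is unused. The sum is computed in
    integers, scaled by 2**max_bits, to avoid floating-point error.
    """
    used = 0
    for n in code_lengths:
        if n < 0 or n > max_bits:
            raise ValueError(f"code length {n} out of range")
        if n:
            used += 1 << (max_bits - n)
    full = 1 << max_bits
    if used == full:
        return "complete"
    return "incomplete" if used < full else "oversubscribed"


if __name__ == "__main__":
    # Literal/length code lengths of deflate's fixed Huffman block (RFC 1951).
    fixed = [8] * 144 + [9] * 112 + [7] * 24 + [8] * 8
    print(kraft_status(fixed))
    # A tree with a single 1-bit code leaves half of the code space unused.
    print(kraft_status([1]))
```

A validator that treats every set other than "complete" as an error will reject streams containing a tree with only one used symbol, so that is one case worth testing when comparing an implementation against the spec.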

as-com commented Apr 29, 2022

I found a fix; the PR is #137.
