Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No error thrown when there is garbage data past end-of-frame #181

Open
embg opened this issue Sep 16, 2022 · 1 comment
Open

No error thrown when there is garbage data past end-of-frame #181

embg opened this issue Sep 16, 2022 · 1 comment

Comments

@embg
Copy link

embg commented Sep 16, 2022

On 0.17.0, the following example doesn't throw an error:

>>> zstandard.decompress(zstandard.compress(b"foo") + b"garbage")
b'foo'

Perhaps debatable, but I think an error should always be thrown if there is data past the last valid frame in a stream. Especially after fixing #59, I don't think it makes sense to silently ignore the garbage past end-of-frame. Users should be notified if they are constructing invalid streams.

Thanks @thatch for identifying this issue.

@indygreg
Copy link
Owner

Ignoring junk data after a valid zstd frame is intended for the decompressobj() API, as that is what the stdlib API does.

But this report is for the ZstdDecompressor.decompress() API, which is different.

I agree the one-shot API should probably reject garbage past end-of-frame. The API docs even say this:

The input bytes are expected to contain a full Zstandard frame. ... If the input does not contain a full frame, an exception will be raised.

So I consider this a bug.

Hyrum's Law says someone is likely relying on the buggy behavior. So we may want to provide an argument to maintain the existing behavior, just in case.

indygreg added a commit that referenced this issue Oct 29, 2022
Its behavior around multiple frames and extra input was under-specified.
Let's explicitly call out the currently implemented behavior.

Related to #59 and #181 and probably other issues.
indygreg added a commit that referenced this issue Oct 29, 2022
Related to #59 and #181. The existing behavior isn't great. But we
should at least have tests to keep us honest if/when behavior changes.
indygreg added a commit that referenced this issue Oct 29, 2022
…o decompress()

This is related to #59 and #181.

I'm not willing to implement read_across_frames at this time, as it is
non-trivial. But we can implement a placeholder to provide future
compatibility with the existing feature.

allow_extra_data was trivial to implement, so it works as advertised.
indygreg added a commit that referenced this issue Oct 29, 2022
…o decompress()

This is related to #59 and #181.

I'm not willing to implement read_across_frames at this time, as it is
non-trivial. But we can implement a placeholder to provide future
compatibility with the existing feature.

allow_extra_data was trivial to implement, so it works as advertised.
indygreg added a commit that referenced this issue Oct 29, 2022
Its behavior around multiple frames and extra input was under-specified.
Let's explicitly call out the currently implemented behavior.

Related to #59 and #181 and probably other issues.
indygreg added a commit that referenced this issue Oct 29, 2022
Related to #59 and #181. The existing behavior isn't great. But we
should at least have tests to keep us honest if/when behavior changes.
indygreg added a commit that referenced this issue Oct 29, 2022
…o decompress()

This is related to #59 and #181.

I'm not willing to implement read_across_frames at this time, as it is
non-trivial. But we can implement a placeholder to provide future
compatibility with the existing feature.

allow_extra_data was trivial to implement, so it works as advertised.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants