Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Footgun with the compressobj() API #182

Open
embg opened this issue Sep 16, 2022 · 2 comments
Open

Footgun with the compressobj() API #182

embg opened this issue Sep 16, 2022 · 2 comments

Comments

@embg
Copy link

embg commented Sep 16, 2022

The fact that you can create multiple compressobj from a single ZstdCompressor allows for foot-guns such as:

a = zstandard.ZstdCompressor()

b = a.compressobj()
c = b.compress(b"prefix")

d = a.compressobj()
e = d.compress(b"foo")
e += d.flush(zstandard.FLUSH_BLOCK)

c = b.compress(b"foo")
c += b.flush(zstandard.FLUSH_BLOCK)

assert(c != e) # Assertion fails!

The API should protect users from interleaving usages of two compressobj. This could be accomplished via a counter that is atomically incremented in the ZstdCompressor. The compressobj would know what count it was created on and throw an error if compress() or flush() are called after the counter in the parent ZstdCompressor is incremented.

Thanks @thatch for identifying this issue and proposing the fix.

@indygreg
Copy link
Owner

We might be able to add some checks here. But doing this comprehensively is difficult to impossible. e.g. sometimes an application may want to abort an in-progress (de)compression operation while still preserving the (de)compressor instance for reuse. This is totally valid!

I think the best we can do in the short term is better document that temporally overlapping usage won't work and will lead to runtime errors.

@embg
Copy link
Author

embg commented Oct 29, 2022

Thanks so much for adding the docs in 6187659!

I'm a little confused about the example you gave. Why preserve the compressobj instance for re-use rather than preserve the ZstdCompressor and create a new compressobj for each compression? Isn't the compressobj a very thin wrapper?

I agree with you that there isn't a perfect solution :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants