
Zstd multithreaded output can depend on number of threads #2327

Closed

terrelln opened this issue Sep 25, 2020 · 3 comments
@terrelln
Contributor

terrelln commented Sep 25, 2020

**Describe the bug**

As reported by @animalize in issue #2238:

When using the ZSTD_e_end directive with an output buffer of size >= ZSTD_compressBound(), the number of jobs is computed by the ZSTDMT_computeNbJobs() function. This function produces a different number of jobs depending on nbWorkers:

```c
static unsigned
ZSTDMT_computeNbJobs(const ZSTD_CCtx_params* params, size_t srcSize, unsigned nbWorkers)
{
    assert(nbWorkers>0);
    {   size_t const jobSizeTarget = (size_t)1 << ZSTDMT_computeTargetJobLog(params);
        size_t const jobMaxSize = jobSizeTarget << 2;
        size_t const passSizeMax = jobMaxSize * nbWorkers;
        unsigned const multiplier = (unsigned)(srcSize / passSizeMax) + 1;
        unsigned const nbJobsLarge = multiplier * nbWorkers;
        unsigned const nbJobsMax = (unsigned)(srcSize / jobSizeTarget) + 1;
        unsigned const nbJobsSmall = MIN(nbJobsMax, nbWorkers);
        return (multiplier>1) ? nbJobsLarge : nbJobsSmall;
}   }
```

**Expected behavior**

The output of zstd multithreaded compression must be independent of the number of threads.

**Fix**

  • Make ZSTDMT_computeNbJobs() independent of nbWorkers.
  • Add a fuzz test that checks that the output of multithreaded zstd is always independent of the number of threads.

**Workaround**

If you need to work around this bug, don't start your streaming job with ZSTD_e_end. Pass at least one byte of input with ZSTD_e_continue before calling with ZSTD_e_end, or ensure your output buffer is smaller than ZSTD_compressBound(inputSize).

@Cyan4973
Contributor

Cyan4973 commented Sep 25, 2020

It's a shortcut to say that the outcome of multithreaded zstd does not depend on the number of threads.

Actually, the supported feature is that the outcome of *streaming* multithreaded zstd does not depend on the number of threads
(and that's what the zstd CLI uses).

This definition makes it possible to consider another potential fix:
do not employ the one-pass shortcut for ZSTD_e_end when nbWorkers >= 1,
since it's the delegation to the one-pass mode that triggers this issue.

This could be less disruptive than trying to adapt the single-pass MT compressor,
which was never designed to offer this guarantee.

Another (potentially positive) side effect is that it would guarantee that streaming multithreaded compression is always non-blocking, since it would no longer delegate to the (blocking) single-pass mode.

edit: scrap that, no longer delegating to the single-pass mode doesn't guarantee non-blocking behavior, since on receiving a ZSTD_e_flush or ZSTD_e_end directive, the MT API contract changes from minimal forward progress to maximal progress.

@ghost

ghost commented Sep 25, 2020

> Another (potentially positive) side effect is that it would guarantee that streaming multithreaded compression is always non-blocking, since it would no longer delegate to the blocking mode.

I once wanted to propose adding a ZSTD_compressStream3() function that is always blocking in multithreaded compression.

If the caller has to keep checking for non-blocking progress, it's very inconvenient.

edit: Just found that checking the progress is not very inconvenient:

```c
do {
    zstd_ret = ZSTD_compressStream2(self->cctx, &out, &in, ZSTD_e_continue);
} while (out.pos != out.size && in.pos != in.size && !ZSTD_isError(zstd_ret));
```

But it would still be better to have an always-blocking ZSTD_compressStream3(); it may be a bit faster, and IMO many users don't need to observe the compression progress.

@terrelln
Contributor Author

> This could be less disruptive than trying to adapt the single-pass MT compressor,
> which was never designed to offer this guarantee.

Yeah, that is probably easier. I had forgotten that all the jobs in the single-pass MT compressor need to be launched at once.

> I once wanted to propose adding a ZSTD_compressStream3() function, that is always blocking in multithreaded compression.

Generally, the way people write streaming compression loops, it shouldn't be terribly inconvenient to not make maximal forward progress. If we were to add something like this, it wouldn't require a new API. We'd probably just need to add a compression parameter to control it. But, I don't currently see a great need for it.

terrelln added a commit to terrelln/zstd that referenced this issue Oct 2, 2020
Simplifies the code and removes blocking from zstdmt.

At this point we could completely delete
`ZSTDMT_compress_advanced_internal()`. However I'm leaving it in because
I think we want to do that in the zstd-1.5.0 release, in case anyone is
still using the ZSTDMT API, even though it is not installed by default.

Fixes facebook#2327.