
skip=n argument for parallelization? #75

Open
stevengj opened this issue Mar 22, 2023 · 3 comments

Comments

@stevengj

It would be nice to support parallelization, simply by exposing the skip feature of generators like Sobol.jl. That is, if you want to evaluate on 1000 processors with 2^15 points each, you could pass skip=2^15 * i on processor i, and then average the results.
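As a minimal sketch of the proposed scheme (Sobol.jl itself is Julia; here a base-2 van der Corput sequence stands in for a real Sobol generator, and the `qmc_points` and `partial_mean` helpers are hypothetical names for illustration):

```python
def van_der_corput(index):
    """Radical inverse of `index` in base 2 (a 1-D stand-in for a Sobol point)."""
    x, denom = 0.0, 1.0
    while index:
        index, bit = divmod(index, 2)
        denom *= 2
        x += bit / denom
    return x

def qmc_points(n, skip=0):
    """First `n` points of the sequence after skipping the first `skip` points."""
    return [van_der_corput(i) for i in range(skip, skip + n)]

# Processor i evaluates its own contiguous block of 2^15 points ...
n = 2 ** 15
def partial_mean(f, i):
    return sum(f(x) for x in qmc_points(n, skip=n * i)) / n

# ... and the driver averages the per-processor results.
f = lambda x: x * x  # integrate x^2 over [0, 1]; exact value is 1/3
estimate = sum(partial_mean(f, i) for i in range(4)) / 4
```

Because the blocks are contiguous and equally sized, averaging the four partial means reproduces the plain QMC estimate over all 2^17 points.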

@dmetivie
Contributor

If I understand correctly, you want to use the

  • first points 1:2^15 on processor 1
  • points 2^15+1:(2^15)*2 on processor 2
  • etc...

and then average the results.

I believe this is not recommended for Sobol sequences. There is no guarantee that the block (2^15)*i+1:(2^15)*(i+1) forms a (t,m,s)-net, i.e. has low discrepancy and hence gives good variance reduction.

In qmcpy and scipy.stats.qmc.Sobol this is forbidden. I quote the warning from the scipy documentation:

Sobol’ sequences are a quadrature rule and they lose their balance properties if one uses a sample size that is not a power of 2, or skips the first point, or thins the sequence [5].

You can just take powers of 2.
IMO, the Owen paper is a must-read to understand what not to do with Sobol sequences.
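A toy demonstration of the balance property, again using a 1-D base-2 van der Corput sequence as a stand-in for Sobol (the first 2^m points of this sequence are exactly the lattice {k/2^m}, so a power-of-2 sample size is perfectly stratified while a nearby non-power-of-2 size is not):

```python
def van_der_corput(index):
    """Radical inverse of `index` in base 2."""
    x, denom = 0.0, 1.0
    while index:
        index, bit = divmod(index, 2)
        denom *= 2
        x += bit / denom
    return x

def estimate(n):
    """QMC estimate of the integral of x over [0, 1] (exact value 0.5)."""
    return sum(van_der_corput(i) for i in range(n)) / n

err_pow2 = abs(estimate(8) - 0.5)  # n = 2^3: perfectly stratified sample
err_odd = abs(estimate(7) - 0.5)   # n = 7: balance property lost
```

Here the power-of-2 sample size gives the smaller error, illustrating why scipy restricts sampling to `random_base2`.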

@ParadaCarleton
Collaborator

I believe this is not recommended for Sobol sequences. There is no guarantee that the block (2^15)*i+1:(2^15)*(i+1) forms a (t,m,s)-net, i.e. has low discrepancy and hence gives good variance reduction.

I think the request is asking for a way to partition one long Sobol sequence into separate chunks that can be run in parallel on separate cores. So, for example, if you'd like to integrate over a net of size 256 across three cores, you could have each core handle 85 or 86 points.

This should be fine, as long as only the accuracy of the final calculation matters (i.e. the final calculation is done by averaging over all the points). You'd only have problems if you needed each core's calculation to be individually accurate.
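A sketch of that recombination, using a van der Corput stream as a stand-in for the Sobol sequence: as long as each chunk's average is weighted by its chunk size, the combined estimate is algebraically identical to the single-stream average over all 256 points, regardless of how unbalanced the individual chunks are.

```python
def van_der_corput(index):
    """Radical inverse of `index` in base 2."""
    x, denom = 0.0, 1.0
    while index:
        index, bit = divmod(index, 2)
        denom *= 2
        x += bit / denom
    return x

N = 256
points = [van_der_corput(i) for i in range(N)]
f = lambda x: x

# Split one length-256 stream across 3 "cores" (chunks of 86, 85, 85).
sizes = [86, 85, 85]
chunks, start = [], 0
for s in sizes:
    chunks.append(points[start:start + s])
    start += s

# Each core averages its own chunk; the driver recombines with size weights.
chunk_means = [sum(map(f, c)) / len(c) for c in chunks]
combined = sum(m * len(c) for m, c in zip(chunk_means, chunks)) / N
full = sum(map(f, points)) / N  # single-stream average for comparison
```

Only `combined` is expected to be accurate; an individual `chunk_means[i]` need not be, since its chunk is generally not a (t,m,s)-net.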

@stevengj
Author

Right.
