Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DE, DEzs, DREAM, DREAMzs: settings$nrChains = x results in 2x or more chains #224

Open
twest820 opened this issue Jul 9, 2021 · 1 comment

Comments

@twest820
Copy link

twest820 commented Jul 9, 2021

I wanted to suggest some possible documentation improvements based on observations of unexpected behavior and some looking through these samplers' source code.

For context, I'm evaluating several variations of the runMCMC call below and noticed trace plots were showing more chains than nrChains. Calling coda package methods also shows a mismatch between nrChains (3 in this example) and the number of chains returned (9).

mcmcBurnin = runMCMC(bayesianSetup = mcmcSetup,
                     sampler = "DEzs",
                     settings = list(iterations = 2000, nrChains = 3)) # vignette recommends nrChains = 3
chainsInCodaFormat = getSample(mcmcBurnin, coda = TRUE)
tibble(iterations = niter(chainsInCodaFormat), variables = nvar(chainsInCodaFormat), chains = nchain(chainsInCodaFormat))
# A tibble: 1 x 3
  iterations variables chains
       <int>     <int>  <int>
1        667         1      9

Looking in the BayesianTools sources, these samplers seem set up to yield Npop * nrChains chains, where Npop is the number of rows in the initialization matrix X. In the default case where settings$startValue isn't populated, X is set by these lines of code.

X = bayesianSetup$prior$sampler(3 * parLen) # DE yields 3 nrChains when parLen = 1
X = bayesianSetup$prior$sampler(3) # DEzs
X = bayesianSetup$prior$sampler(3) # DREAMzs

There is also

X = bayesianSetup$prior$sampler(max(4,2 * parLen)) # DREAM defaults to either 2x or 4x nrChains

From looking through the journal papers on DE, DEzs, DREAM, and DREAMzs I think the way BayesianTools is designed is that

  1. The samplers are set up internally to default to the number chains recommended by the authors of the corresponding paper.
  2. At least for these implementations of differential evolution MCMC, settings$nrChains is not actually the intended number of chains. It's the number of populations.

If I'm understanding correctly, I think what this indicates is settings$nrChains = 1 is a typical default for these samplers and, if it's desired to have a number of chains different from the algorithm defaults, manipulation of settings$startValue may be more desirable than changing nrChains.

Apologies if I've missed something in the package documentation but, assuming the above interpretation is correct, it seems to me it may be helpful to

  1. Revise the description of nrChains in the documentation for runMCMC().
  2. Include nrChains in the settings documentation for DE, DEzs, DREAM, DREAMzs, and similar.
  3. Consider using a name other than nrChains in the settings objects for population based samplers.

A related consideration is these four samplers all calculate the length of a chain as

n.iter <- ceiling(settings$iterations/Npop)

but the current documentation for iterations is typically worded in ways which suggests it controls the total number of times likelihood is evaluated across all chains rather than for a single population of chains. Since Npop isn't readily user visible and it happens to be that Npop = nrChains = 3 appears to be a recommended configuration for some of these samplers, it's easy for a user to mistakenly conclude the chains are of length ceiling(settings$iterations / settings$nrChains). There's also potential for confusion over the total number of likelihood evaluations being nrChains * iterations rather iterations.

@twest820 twest820 changed the title DE, DEzs, DREAM, DREAMzs: settings$nrChains = x results in 2+x chains DE, DEzs, DREAM, DREAMzs: settings$nrChains = x results in 2x or more chains Jul 9, 2021
@florianhartig
Copy link
Owner

florianhartig commented Jul 9, 2021

This is all expected / intended behaviour, although it may be true that documentation can be improved.

The DE sampler family are so-called population MCMCs, that is, if you do one MCMC run, the MCMC runs several internal chains. Note, for example, that starting conditions for all DE family samplers is a matrix, not a single parameter vector.

The argument nrChains in the runMCMC function, in retrospect, should have probably been called nrIndependentMCMCruns or something like that, because it determines how many independent MCMCs are run, which is 3 in your example.

I know that ‪Cajo ter Braak has suggested that it is sufficient to run only on DE MCMC, and check convergence of the internal chains, based on the proof in one of his papers that the internal chains in the DE algorithms are independent when the sampler is equilibrium.

However, when working with the DE MCMCs in practice, you will notice that internal chains in the DE family often show cross-correlation. I don't actually know what the exact reason for this is, but we set up the starting conditions and Z matrix from the prior, which is substantially wider than the posterior, but we still often observer that the internal chains are more similar than the independent MCMC runs.

Based on this, my / our recommendation (which I think is also noted somewhere in the help) is to not treat the internal chains of the DE family as independent MCMC chains.

From this logic, we derive the following behaviour:

  • nrChains = independent chains
  • plots will show nrChains * internalChains, but note that the internal chains are not independent, so there are always several chains that belong to the same MCMC and should not be viewed as independent.
  • for iterations, we calculate the total model evaluations of the MCMC, so we have per internal chain iterations / nrInternalChains. Again, I think this is desirable, as the internal chains are merged later anyway, and in this way the number of likelihood evaluation does not change if the number of internal chains of the MCMC is changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants