
# Technical FAQs

This page lists some common questions on how to use DISCOTRESS effectively. The citations referenced here correspond to those listed on the main page. Remember to refer to the tutorials if you are stuck.

## Which sampling method should I use for my simulation?

DISCOTRESS divides its dynamical simulation methods into two classes: "wrapper" methods, which handle an ensemble of so-called walkers (independent trajectories) via a division of the state space (e.g. WE sampling, milestoning, or no special method), and "trajectory" methods, which propagate an individual trajectory (BKL, kPS, MCAMC).

For Markov chains that are nearly reducible, which commonly arise when constructing a network that represents a realistic dynamical process of interest, the standard simulation algorithm (BKL) used alone will most likely be too inefficient to simulate the dynamics, because of "flickering" within long-lived (i.e. metastable) macrostates. It then becomes necessary to use trajectory methods that are unaffected by metastability (kPS, MCAMC), and/or to employ an enhanced sampling methodology to handle an ensemble of walkers simulated in parallel. Trajectory segments from the walkers can then be stitched together, with appropriate weighting, to yield complete 𝔄 ← 𝔅 paths.

To choose the appropriate enhanced sampling algorithm, there are many factors to consider, including the information that is desired from the simulation. For instance, milestoning and an adaptation of WE sampling are used to simulate the equilibrium (steady-state) ensemble of 𝔄 ← 𝔅 transition paths [1], which can also be achieved by simulating a very long trajectory that continually transitions between the 𝔄 and 𝔅 states [2,3]. The other simulation methods compute dynamical quantities for the nonequilibrium ensemble of 𝔄 ← 𝔅 paths (i.e. the first hitting problem with respect to the initial probability distribution). The topology and dynamics of the Markov chain also influence which enhanced sampling method is the best choice. For a thorough discussion concerning the choice of sampling method, see [1].

In general, kPS [1,2] provides a highly efficient method to simulate the dynamics for metastable DTMCs and CTMCs.
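The flickering problem that motivates these methods is easy to see on a toy model. The following Python sketch is purely illustrative (it is not DISCOTRESS code, and the 4-state CTMC, its rates, and the community labels are invented for the example); it runs the standard BKL (rejection-free kinetic Monte Carlo) algorithm and counts how many steps are spent on unproductive intra-community moves:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-state CTMC with two metastable communities, {0,1} and {2,3}:
# fast intra-community rates (10.0) and slow inter-community rates (1e-4).
K = np.array([[0.0,  10.0, 1e-4, 0.0 ],
              [10.0, 0.0,  0.0,  1e-4],
              [1e-4, 0.0,  0.0,  10.0],
              [0.0,  1e-4, 10.0, 0.0 ]])  # K[i, j] = transition rate i -> j

def bkl_step(state, K, rng):
    """One BKL step: choose the next state with probability proportional to
    its transition rate, and draw an exponentially distributed waiting time
    with mean 1 / (total exit rate)."""
    rates = K[state]
    total = rates.sum()
    nxt = rng.choice(len(rates), p=rates / total)
    return nxt, rng.exponential(1.0 / total)

comm = [0, 0, 1, 1]  # community membership of each node
state, t, flickers, hops = 0, 0.0, 0, 0
for _ in range(100_000):
    nxt, dt = bkl_step(state, K, rng)
    if comm[nxt] == comm[state]:
        flickers += 1  # unproductive intra-community move
    else:
        hops += 1      # rare inter-community transition
    state, t = nxt, t + dt

# Nearly every BKL step is spent flickering inside a metastable macrostate.
print(f"flickers: {flickers}, inter-community hops: {hops}")
```

Methods such as kPS and MCAMC avoid this cost by sampling the escape from a trapping basin in a single step, rather than simulating every microscopic transition.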

The state reduction algorithms for the exact computation of dynamical properties also determine quantities associated with the nonequilibrium TPE, although equilibrium TPE properties can be subsequently derived [3]. The state reduction algorithms can be used to characterise the 𝔄 ← 𝔅 process at both a microscopic and macroscopic level of detail, but are typically feasible only for Markov chains of no more than several thousand nodes. Simulation offers a more scalable approach to estimate the same properties via sampling of trajectories.
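For orientation, one of the quantities the state reduction algorithms compute (the MFPT to an absorbing set) can also be obtained on a small toy chain by direct linear algebra via first-step analysis. The sketch below is illustrative Python, not the DISCOTRESS implementation, and the 4-state DTMC is invented; note that for strongly metastable chains this naive solve becomes ill-conditioned, which is precisely the motivation for the numerically stable state reduction algorithms:

```python
import numpy as np

# Hypothetical 4-state DTMC; node 3 is the absorbing target set A.
P = np.array([[0.5, 0.4, 0.1, 0.0],
              [0.3, 0.5, 0.1, 0.1],
              [0.1, 0.2, 0.6, 0.1],
              [0.0, 0.0, 0.0, 1.0]])

# First-step analysis: the MFPTs (in steps) from the transient nodes
# satisfy tau = 1 + Q tau, where Q is P restricted to the non-A nodes.
Q = P[:3, :3]
tau = np.linalg.solve(np.eye(3) - Q, np.ones(3))
print(tau)  # MFPT to node 3 from nodes 0, 1 and 2
```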

## How should I obtain a predefined partitioning of the Markov chain for use with the enhanced sampling simulation algorithms?

All of the enhanced sampling methods described in [1] are based on a partitioning of the network into communities, which must accurately characterise the metastable macrostates of the system. Be aware that the choice of this community structure strongly affects the efficiency of the simulation. The question of how to obtain this partitioning is therefore of critical importance.

Even for small networks, spectral methods (such as the original and robust Perron cluster cluster analysis methods, PCCA and PCCA+, respectively) will fail owing to numerical instability if the Markov chain is metastable. Hence, for such networks, BACE (the Bayesian Agglomerative Clustering Engine) is a favourable alternative, but is not scalable. More scalable community detection algorithms are usually stochastic, but such algorithms may misclassify nodes close to the boundary between metastable macrostates. This problem can be attenuated by variational refinement of an initial clustering [6]. Unfortunately, many state-of-the-art community detection algorithms are based on optimisation of the modularity objective, and therefore are liable to misrepresent the dynamics. Multi-level regularised Markov clustering (MLR-MCL), an implementation of which is available here, provides a suitable alternative [1].
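A quick spectral sanity check is often useful before (or after) running a clustering algorithm: the number of eigenvalues of the transition matrix close to 1 (the "Perron cluster") equals the number of metastable macrostates. The Python sketch below uses an invented 4-state DTMC with two communities; it is an illustration, not part of any of the clustering tools mentioned above:

```python
import numpy as np

# Hypothetical metastable 4-state DTMC: communities {0,1} and {2,3},
# coupled only through the small inter-community probability eps.
eps = 1e-3
P = np.array([[0.5 - eps, 0.5,       eps,       0.0      ],
              [0.5,       0.5 - eps, 0.0,       eps      ],
              [eps,       0.0,       0.5 - eps, 0.5      ],
              [0.0,       eps,       0.5,       0.5 - eps]])

# Eigenvalues near 1 count the metastable macrostates.
evals = np.sort(np.linalg.eigvals(P).real)[::-1]
n_metastable = int(np.sum(evals > 0.99))
print(evals, n_metastable)  # two eigenvalues near 1 -> two macrostates
```

As eps → 0 the two leading eigenvalues coalesce and the corresponding eigenvectors become arbitrary mixtures of one another; this near-degeneracy is the source of the numerical instability that defeats spectral methods such as PCCA and PCCA+ on strongly metastable chains.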

## If I have determined metastable communities of nodes in the Markov chain, how do I determine a reduced Markov chain?

The simplest way to perform dimensionality reduction of a Markov chain, for which the metastable macrostates have already been determined, is to use the local equilibrium approximation to obtain transition probabilities or rates between the communities of nodes [9]. However, this method is sensitive to the precise definition of the community boundaries and performs poorly if the communities are not strongly metastable. The optimal choice of coarse-grained Markovian transition probabilities or rates can be determined from knowledge of the matrix of MFPTs between all pairs of nodes [7]. For large Markov chains, where this method is not feasible, a reduced Markov chain can be estimated accurately from trajectory data using maximum-likelihood or Gibbs sampling approaches [6].
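Under the local equilibrium approximation, the coarse-grained rate from community I to community J is a stationary-weighted average of the microscopic rates crossing the boundary. A minimal Python sketch, assuming an invented 4-state CTMC and community assignment (this is not the DISCOTRESS implementation):

```python
import numpy as np

# Hypothetical 4-state CTMC; communities I = {0,1} and J = {2,3}.
K = np.array([[0.0,  10.0, 0.01, 0.0 ],
              [8.0,  0.0,  0.0,  0.02],
              [0.01, 0.0,  0.0,  5.0 ],
              [0.0,  0.03, 6.0,  0.0 ]])  # K[i, j] = rate i -> j

# Stationary distribution pi: left null vector of the generator matrix,
# found by least squares with the normalisation sum(pi) = 1 appended.
Qgen = K - np.diag(K.sum(axis=1))
A = np.vstack([Qgen.T, np.ones(4)])
b = np.zeros(5); b[-1] = 1.0
pi = np.linalg.lstsq(A, b, rcond=None)[0]

def lea_rate(I, J, K, pi):
    """Local equilibrium approximation: the coarse-grained rate I -> J is the
    average of the microscopic rates i -> j, weighted by the stationary
    probabilities of the source nodes, conditioned on being in community I."""
    flux = sum(pi[i] * K[i, j] for i in I for j in J)
    return flux / pi[list(I)].sum()

print(lea_rate([0, 1], [2, 3], K, pi), lea_rate([2, 3], [0, 1], K, pi))
```

Because the weights are the within-community stationary probabilities, the result depends on exactly which nodes are assigned to each community, which is the sensitivity to boundary definitions noted above.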

## I want my simulation to go faster!

To achieve efficient dynamical simulations using enhanced sampling methods, the community structure must accurately characterise the metastable sets of nodes. It is therefore often worth exploring different algorithms, and varying their parameters, to obtain a suitable partitioning of the Markov chain. There are various rigorous measures for assessing the quality of a partitioning [6]. A common mistake when using the kPS and MCAMC algorithms is failing to run a fixed number of standard BKL steps after each basin escape iteration.

It is also possible to pre-process the Markov chain to reduce dimensionality and eliminate flickering, while maintaining an accurate representation of the slow dynamics for the original Markov chain. A simple method for this purpose is to recursively regroup states according to a specified transition rate or probability threshold [9]. A more complex approach is to renormalise the Markov chain following the elimination of appropriately chosen nodes, which has the advantage of preserving the mean and variance of the 𝔄 ← 𝔅 FPT distribution if only nodes outside of the 𝔄 and 𝔅 sets are eliminated [8].
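A single elimination step of this renormalisation procedure (the graph transformation, GT) can be sketched in a few lines. In the Python illustration below, the 4-state DTMC, its unit waiting times, and the choice of eliminated node are all invented for the example; eliminating a node redistributes its flux and waiting time onto the surviving nodes so that mean first passage times are preserved exactly:

```python
import numpy as np

# Hypothetical 4-state DTMC with unit mean waiting times; node 3 absorbing.
P = np.array([[0.5, 0.3, 0.2, 0.0],
              [0.2, 0.5, 0.2, 0.1],
              [0.3, 0.3, 0.2, 0.2],
              [0.0, 0.0, 0.0, 1.0]])
tau = np.ones(4)  # mean waiting time of each node

def gt_eliminate(P, tau, n):
    """Graph transformation: eliminate node n by renormalising the surviving
    branching probabilities, P'_ij = P_ij + P_in P_nj / (1 - P_nn), and
    waiting times, tau'_i = tau_i + P_in tau_n / (1 - P_nn)."""
    keep = [i for i in range(len(P)) if i != n]
    denom = 1.0 - P[n, n]
    P2 = P[np.ix_(keep, keep)] + np.outer(P[keep, n], P[n, keep]) / denom
    tau2 = tau[keep] + P[keep, n] * tau[n] / denom
    return P2, tau2

# MFPT to node 3, before and after eliminating node 2: identical.
t_full = np.linalg.solve(np.eye(3) - P[:3, :3], tau[:3])
P2, tau2 = gt_eliminate(P, tau, 2)
t_red = np.linalg.solve(np.eye(2) - P2[:2, :2], tau2[:2])
print(t_full[:2], t_red)  # MFPTs from nodes 0 and 1 agree
```

This sketch propagates only the mean waiting times, so it preserves the mean FPT; preserving the FPT variance as well requires additionally tracking second moments, as described in [8].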