Skip to content

Tutorial 3

Tim Vaughan edited this page May 19, 2015 · 18 revisions

The previous tutorials have focused on the stochastic simulation of population size dynamics. We now turn to the second major application of MASTER: inheritance graph simulation. Simply put, MASTER allows one to combine the "reaction" description of a stochastic population process with information about which reactant individual parents which product individual. This additional information can then be used to simulate trees or graphs representing the inheritance relationships between a subset of the individuals in the model, in conjunction with the simulation of the population size dynamics.

To illustrate this, we use the stochastic SIR model from epidemiology. This model involves a population of humans or animals divided into three compartment: the S compartment containing individuals currently susceptible to a contagious disease, the I compartment containing individuals currently infected with the disease, and the R compartment containing individuals which have recovered from the disease. Individuals in these compartments are subject to two distinct processes. Firstly, the infection process takes place at rate and requires interaction between a susceptible and an infected individual and produces a second infected individual:

Secondly, the recovery process occurs at rate and involves an infected individual spontaneously recovering and moving to the R compartment:

Note that while the above reactions are sufficient to describe the stochastic dynamics of the population sizes, they in no way specify which (if any) reactants are parents of each of the products. For instance, should we consider the incoming I and S individuals to both be parents of the newly infected individual? In terms of the animals involved, it may make sense to consider a continuous lineage crossing from S to I at this point, seeing as the conversion represents a single animal becoming infected. In terms of the pathogen, however, it may instead make sense to consider the I reactant as the parent of both infected products, as this represents the path of infection transmission.

Due to this ambiguity, it is clear that additional information beyond the given reaction scheme is required in order to construct an inheritance tree or graph. We show how to specify this information in MASTER below.

Input file

To generate an inheritance graph or tree, we make use of the InheritanceTrajectory calculation type. The outer XML elements are:

<beast version='2.0' namespace='master:master.model:master.steppers:master.conditions:master.postprocessors:master.outputs'>
  <run spec='InheritanceTrajectory'
       samplePopulationSizes="true">

  </run>
</beast>

The inheritance trajectory <run> element takes similar attributes to that of the regular trajectory calculation type. The difference here is the inclusion of the samplePopulationSizes attribute which, when set to "true", causes MASTER to record population size dynamics in addition to the inheritance graph.

Model

The model specification is

<model spec='Model' id='model'>
  <population spec='Population' id='S' populationName='S'/>
  <population spec='Population' id='I' populationName='I'/>
  <population spec='Population' id='R' populationName='R'/>
      
  <reaction spec='Reaction' reactionName="Infection" rate="0.005">
    S + I -> 2I
  </reaction>
  <reaction spec='Reaction' reactionName="Recovery" rate="0.2">
    I -> R
  </reaction>
</model>

As discussed above, the reaction scheme on its own does not uniquely identify inheritance relationships between reactants and products. In the absence of any additional information, reactions automatically designate the first reactant appearing on the left-hand side of a given reaction as the sole parent of all products of the same population type. The same assignment is then completed for the second reactant and so-on, until all reactants are depleted. Reactants which are not assigned any children are taken as lineage termination points, while products that have no parents will produce new lineages.

This is the approach which is used in the model specification above. It causes the infection reaction to result in the termination of a susceptible lineage and a branching of the lineage belonging to the infected reactant individual into the two product infected individuals. Thus, the above model can be used to simulate a disease transmission tree under the SIR model.

Initial state

The initial state is given specified using a very similar form to that used in the context of the non-inheritance calculations. The major difference, however, is that along with setting the non-zero population sizes, it specifies the addition of special individuals which seed lineages in the simulated inheritance graph. In our case, we are interested in simulating the infection transmission tree and thus add the instigator of the epidemic, a member of the infected population, as a lineage seed.

<initialState spec='InitState'>
  <populationSize spec='PopulationSize' population='@S' size='199'/>
  <lineageSeed spec='Individual' population='@I'/>
</initialState>

Note that the total population size is the sum of sizes specified in the <populationSize> elements and the number of <lineageSeed> elements.

End condition

You may have noticed that in the <run> element for this calculation we did not specify a fixed simulation time period. As the epidemic will certainly terminate at some finite time due to the nature of the model, we instead specify the following end condition:

<lineageEndCondition spec='LineageEndCondition' nLineages="0"/>

This causes the calculation to terminate once the number of remaining lineages (initially 1) falls to zero.

Output options

As we have included the samplePopulationSizes="true" attribute in the <run> element, we are free to specify a JSON output file as before. This will contain the dynamics of the S, I and R population sizes. In addition, we can specify NewickOutput and NexusOutput types which, respectively, cause the simulated infection transmission tree to be written out to files in those formats. In our case, we have chosen to specify all three.

<output spec='NewickOutput' fileName='SIRTree_output.newick'/>
<output spec='NexusOutput' fileName='SIRTree_output.nexus'/>
<output spec='JsonOutput' fileName='SIRTree_output.json'/>

Note that the NEXUS and Newick tree output formats differ in some non-superficial ways. In particular, as described in the output element description, the Newick format contains only the bare tree structure while the NEXUS format includes node annotations that specify the reactions which result in tree modifications and the populations which carry the lineage above each node in the tree.

Complete input file

With these elements combined, we have the following MASTER simulation specification:

<beast version='2.0' namespace='master:master.model:master.steppers:master.conditions:master.postprocessors:master.outputs'>
  <run spec='InheritanceTrajectory'
       samplePopulationSizes="true">

    <model spec='Model' id='model'>
      <population spec='Population' id='S' populationName='S'/>
      <population spec='Population' id='I' populationName='I'/>
      <population spec='Population' id='R' populationName='R'/>
      
      <reaction spec='Reaction' reactionName="Infection" rate="0.005">
	S + I -> 2I
      </reaction>
      <reaction spec='Reaction' reactionName="Recovery" rate="0.2">
	I -> R
      </reaction>
    </model>
    
    <initialState spec='InitState'>
      <populationSize spec='PopulationSize' population='@S' size='199'/>
      <lineageSeed spec='Individual' population='@I'/>
    </initialState>

    <!-- Simulation will terminate when no lineages remain -->
    <lineageEndCondition spec='LineageEndCondition' nLineages="0"/>
    
    <output spec='NewickOutput' fileName='SIRTree_output.newick'/>
    <output spec='NexusOutput' fileName='SIRTree_output.nexus'/>
    <output spec='JsonOutput' fileName='SIRTree_output.json'/>
  </run>
</beast>

Visualization

The generated NEXUS and Newick files can be easily visualized using the application FigTree. The figure below, for instance, was produced by loading the file SIRTree_output.nexus into v1.4.0 of FigTree, selecting the "polar" tree type, removing the tip labels and using the "reaction" annotation to determine the edge colour. (This causes the edges to be coloured by the specific process that terminates them.)

Additionally, the population sizes can be graphed in R using the approach described in tutorial 1: