pratikunterwegs/pathomove

Source code for Pathomove, an individual-based model for the evolution of animal movement strategies under the risk of pathogen transmission

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

This repository holds the source code for the Pathomove simulation, a spatially explicit, individual-based model of the evolution of animal social movement strategies under the risk of pathogen transmission.

The model is written by Pratik Gupte, in the Modelling Adaptive Response Mechanisms Group (Weissing Lab) at the Groningen Institute for Evolutionary Life Sciences, at the University of Groningen.

The source code for analyses of this simulation's output can be found on GitHub at https://github.com/pratikunterwegs/patho-move-evol, where a link to an archived version on Zenodo should be available.

Contact and Attribution

Please contact Pratik Gupte for questions on the model or the associated project.

Name: Pratik Rajan Gupte
Email: pratikgupte16@gmail.com OR p.r.gupte@rug.nl
ORCID: https://orcid.org/0000-0001-5294-7819

You can cite all versions by using the DOI 10.5281/zenodo.6331815. This DOI represents all versions, and will always resolve to the latest one.

Simulation model

Please refer to the preprint on bioRxiv for a full description of the model, and the biological system it aims to simulate.

This model ties together a number of different concepts:

  1. Mechanistic modelling of the evolution of animal movement decisions, following a framework from an earlier model (Gupte et al. 2021).

  2. Exploitation competition for discrete food items distributed in continuous space, rather than on a grid.

  3. The introduction and spread of an infectious pathogen between agents when they are close together. The pathogen causes a chronic 'disease', which reduces net energy, and hence fitness.


Simulation methods

The model combines a number of interesting tools to implement its conceptual components:

  1. Familiarity in R, speed in C++: The simulation is written in C++, but disguised as easy-to-use R functions. Rcpp is used to link the two. The model runs with a single function, run_pathomove.

  2. R functions return R objects: Simulation results from run_pathomove are returned to R as well-known objects (lists and data.frames).

  3. Fast and efficient distance calculations: Distances between agent pairs, and between agents and food items, are calculated many hundreds of thousands of times, using Boost R-trees.

  4. Speed boosts using TBB multi-threading: Internal simulation functions are sped up using Intel's Threading Building Blocks (TBB) library, which is conveniently included with RcppParallel, making it cross-platform.

  5. Testing Rcpp functions: Some of the internal C++ functions underlying the main simulation code (run_pathomove) are tested using Catch, which is integrated with the R package testthat.


Running the model

The model is bundled as an R package, which means it can be installed and run out-of-the-box on most systems with minimal effort. The actual model code is written in C++; users need not interact with this code.

Pre-requisites

The package depends on the following R packages, which are installed by default alongside it.

  1. Rcpp to link R and C++, and to export simulation data directly as R lists and data.frames.

  2. RcppParallel for multi-threading using Intel's Threading Building Blocks (TBB).

  3. BH to make the Boost.Geometry headers available to the Rcpp package. The package does not explicitly link to system Boost installations (though this is possible).

RcppParallel on Windows: An Important Note

The contents of the src/Makevars.win script need to be copied to the Makevars.win file of your local R installation on Windows, usually located in Documents/.R/. This helps the package find the TBB libraries provided by RcppParallel. This is not necessary on Linux systems.

Copy this to Documents/.R/Makevars.win on Windows systems only

CXX_STD = CXX14
PKG_CXXFLAGS += -DRCPP_PARALLEL_USE_TBB=1
PKG_LIBS += $(shell "${R_HOME}/bin${R_ARCH_BIN}/Rscript.exe" -e "RcppParallel::RcppParallelLibs()")

Installation

  1. Clone the repository using SSH by running git clone git@github.com:pratikunterwegs/pathomove.git.

  2. In R, build the package using devtools::build().

  3. In R, install the package using devtools::install().

Alternatively, install the model as an R package directly from R using the code:

devtools::install_github("pratikunterwegs/pathomove")

  1. Try out the model using the script scripts/chk_pkg_install.R. Be warned that specifying a large number of individuals, generations, or timesteps within generations will take a long time, and may crash on lower-capacity hardware.

  2. Alternatively, run simulation replicates using the R scripts provided in the scripts folder on https://github.com/pratikunterwegs/patho-move-evol.

Usage on different systems

  • Linux and Windows: This package is confirmed to work on both Linux (Ubuntu 20.04+) and Windows (10) systems. This is checked weekly using a GitHub Actions job, the details of which can be found in .github/workflows/R-CMD-check.yaml.

  • Multi-threading: This package uses Intel's TBB library for multi-threading, which substantially improves the speed of the underlying C++ code. This is especially noticeable when running large population sizes, or many generations. This functionality is confirmed to work on both Windows and Linux systems, as above.

    Multi-threading can be turned on for the lone function that uses it, pathomove::run_pathomove, by setting the multithreaded argument to TRUE.

  • High-performance computing clusters: The installation of this package on an Ubuntu-based HPC cluster can be automated by running the shell script provided in the bash/ folder. The example below shows how to install it on the University of Groningen's HPC cluster.

    #!/bin/bash
    # script to install pathomove on the peregrine cluster
    
    ml load R/4.1.0-foss-2021a
    ml load Boost/1.76.0-GCC-10.3.0
    ml load tbb/4.4.2.152
    
    # here working in R
    Rscript --slave -e 'devtools::build()'
    Rscript --slave -e 'sink("install_log.log"); devtools::install(upgrade = "never"); sink()'

    An example of a template job script is provided as bash/main_job_maker.sh.

  • Mass job submission to an HPC cluster: The function use_cluster in R/fun_use_cluster.R can be used to run multiple replicates of this simulation, or multiple parameter combinations; please use this advanced functionality carefully.

  • Multi-threading caveat for high-performance computing clusters: When using (an Ubuntu-based) HPC cluster, multi-threading may not work, even when the cluster has TBB available and loaded. It is not entirely clear why. When using an HPC cluster, set multithreaded = FALSE, to use single-threaded alternatives of multi-threaded functions.

  • macOS: This package likely does not work on macOS. This is related to using Intel's TBB library for multi-threading. Users can try to use the single-threaded option at their own risk.


Package documentation

Each function in the package is documented, and this can be accessed through R help, once the package is installed.

?pathomove::run_pathomove

Alternatively, build the package manual (a PDF version of the documentation) after installing the package. A pre-built version of the documentation is provided among the supplementary files in the associated bioRxiv submission.

devtools::build_manual(pkg = "pathomove")

Workflow

The workflow to run this model to replicate the results presented in our bioRxiv manuscript is described more thoroughly in the README of a dedicated repository, https://github.com/pratikunterwegs/patho-move-evol.

A basic working example of how to use this package can be found in the script in the vignettes directory, vignettes/basic_usage.Rmd.

Note: To have completely reproducible simulations, it is necessary to run the simulation in single-threaded mode. Multi-threaded simulation runs are not reproducible.

The basic workflow for the package is:

Local use

  1. Install the package.

  2. Run the following commands.

# run a single replicate with a single combination of parameters
pathomove::run_pathomove(..., multithreaded = TRUE)

Here, '...' indicates the many function arguments, such as population size, landscape size, the number of generations, and when the pathogen is introduced.

The multithreaded argument controls multi-threading, which speeds up the simulation; setting it to TRUE results in automatic use of as many threads as TBB decides internally.

HPC cluster use

Warnings

Please note: This is an advanced workflow, and should not be attempted lightly.

This workflow describes how to prepare a combination of parameters, and create a job array on an HPC cluster, so that a separate simulation is run for each parameter combination and replicate.
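As a rough illustration of the job-array idea (not the package's own template in bash/main_job_maker.sh), the sketch below shows how one SLURM array task could pick its own row from a parameter file. The file name demo_parameters.csv, its column names, and the array range are assumptions made for this example only.

```shell
#!/bin/bash
#SBATCH --array=1-3        # one array task per parameter row (3 rows in this demo)
#SBATCH --time=00:10:00

# Hypothetical parameter file; the column names are illustrative only
params="demo_parameters.csv"
printf 'popsize,replicate\n100,1\n100,2\n200,1\n' > "$params"

# SLURM sets SLURM_ARRAY_TASK_ID for each array task; default to 1 so
# that the script can also be tried outside the scheduler
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# Data row number task_id is on file line task_id + 1 (line 1 is the header)
row=$(sed -n "$((task_id + 1))p" "$params")
echo "task $task_id would run parameters: $row"
```

In a real job, the echo would be replaced by a call to the simulation run script, with the selected row (or the task number) passed to it.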

Please note: If any part of this sounds unfamiliar, please stop now, and consider using the simulation locally.

Warning: This workflow currently needs to be run from a Linux system, due to issues converting between line-ending types on Windows and Linux systems.
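To see what the line-ending difference looks like in practice, the sketch below creates a small file with Windows (CRLF) line endings, counts the affected lines, and writes a copy with Unix (LF) endings. The file names are hypothetical, and this only demonstrates the difference; it is not a tested way of running the workflow from Windows.

```shell
# Hypothetical demo file standing in for a job script saved on Windows (CRLF endings)
printf '#!/bin/bash\r\necho "hello"\r\n' > demo_job.sh

# A literal carriage-return character, obtained portably
cr=$(printf '\r')

# Count the lines that end in a carriage return
crlf_before=$(grep -c "${cr}\$" demo_job.sh || true)

# Strip all carriage returns, writing a copy with Unix (LF) line endings
tr -d '\r' < demo_job.sh > demo_job_unix.sh
crlf_after=$(grep -c "${cr}\$" demo_job_unix.sh || true)

echo "lines ending in CR before: $crlf_before, after: $crlf_after"
```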

Workflow

  1. Install the package locally.

  2. Install the package on the cluster.

  3. Prepare a directory structure to store the output. A template directory structure can be found at https://github.com/pratikunterwegs/patho-move-evol.

There should be at least the following paths:

yourFolder
├───bash
├───data
│   ├───output
│   └───parameters
└───scripts
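
The layout above can be created in one step from a shell. This is a minimal sketch; yourFolder is the same placeholder name used in the tree, to be replaced with your own project folder.

```shell
# Placeholder project folder name, as in the tree above
root="yourFolder"

# Create the folders expected by the cluster workflow;
# -p also creates the parent folders and does not fail if they already exist
mkdir -p "$root/bash" "$root/data/output" "$root/data/parameters" "$root/scripts"

# List the directories that now exist under the project folder
find "$root" -type d | sort
```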

  4. Prepare an R script to actually run the run_pathomove command on the cluster, and to save the output. An example can be found in scripts/do_sim_pathomove.R.

  • You should prepare this locally; it will be uploaded to the cluster.

  • Make sure that the output path to save the simulation results is in the directory structure shown above; e.g. yourFolder/data/output.

  5. Prepare a template job. An example is found in bash/main_job_maker.sh. This script is written for an Ubuntu-based, SLURM-scheduler HPC cluster.

  6. Run the following commands locally from R.

# this should be your R terminal
# be careful about working directories etc.
# load the package locally
library(pathomove)

# make a parameter file with all the combinations required
# or with multiple replicates
pathomove::make_parameter_file(
  ...,
  replicates = N,
  which_file = "some parameter file name.csv"
)

# above, ... indicates the simulation parameters

# use the use_cluster function to send in jobs
pathomove::use_cluster(
  ssh_con = "ssh connection to your HPC cluster",
  password = "your HPC password",
  script = "your simulation run script", # e.g. scripts/do_sim_pathomove.R
  folder = "yourFolder", # folder for the output
  scenario_tag = "scenario_tag", # user-defined name for the scenario
  template_job = "template job shell script", # the shell script from (5)
  parameter_file = "some parameter file name.csv" # the parameter data
)

  7. Simulation output should be returned as Rds files into the data/output folder specified above on the cluster, or your custom equivalent. Move these Rds files to your local system for further analysis.
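
One way to move the Rds files to a local machine is with scp, sketched below. The login name, host name, and folder paths are placeholders rather than values from the package, and the script only prints the command unless dry_run is set to 0.

```shell
# All values below are placeholders; replace them with your own
remote="user@cluster.example.org"      # hypothetical SSH login for the cluster
remote_dir="yourFolder/data/output"    # output folder on the cluster
local_dir="data/output"                # where to store the files locally
dry_run=1                              # set to 0 to actually copy the files

# Make sure the local destination exists
mkdir -p "$local_dir"

if [ "$dry_run" -eq 1 ]; then
  # Only print the command that would be run
  echo scp "$remote:$remote_dir/*.Rds" "$local_dir/"
else
  # The quoted wildcard is expanded on the cluster, not locally
  scp "$remote:$remote_dir/*.Rds" "$local_dir/"
fi
```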

Please note (again): This is advanced functionality. It is brittle, i.e., it is not tested to work across a range of systems. Please do not attempt this lightly.

License

MIT (see LICENSE.md)
