This repository contains data and scripts for reproducing the results accompanying the manuscript
Brian Lee1, Muhammad Saqib Sohail2, Elizabeth Finney1, Syed Faraz Ahmed2, Ahmed Abdul Quadeer2, Matthew R. McKay2,3,4,5 and John P. Barton1,#
1 Department of Physics and Astronomy, University of California, Riverside
2 Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology
3 Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology
4 Department of Electrical and Electronic Engineering, University of Melbourne
5 Department of Microbiology and Immunology, University of Melbourne, at The Peter Doherty Institute for Infection and Immunity
# correspondence to john.barton@ucr.edu
The preprint is available at https://www.medrxiv.org/content/10.1101/2021.12.31.21268591.
- Branching process simulations: A notebook for generating and analyzing simulations is given in `simulations.ipynb`, and the scripts for generating and analyzing simulations are given in the `simulation-scripts/` directory.
- SIR simulations: The folder `SIR` contains MATLAB files for running and analyzing two different multi-variant SIR simulations. This folder contains its own readme file with instructions on how to use the files.
- Data processing: A notebook for processing and analyzing SARS-CoV-2 sequence data is given in `data-paper.ipynb`. Scripts for analyzing and processing the data are given in the `data_processing.py` module and the `processing-files/` directory. Due to the large number of SARS-CoV-2 genomes, much of the data analysis is best run on a computer cluster. We have provided code for producing the necessary job files in the `data-paper.ipynb` notebook. The original sequence data and metadata can be downloaded from GISAID.
- Figures: A notebook for generating the figures found in the paper and the supplementary material is given in `figures.ipynb`. Modules for generating the figures in the paper are given in `figs.py` and `mplot.py`, while figures in the supplementary data can be produced using the `epi_figs.py` module.
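For readers unfamiliar with the simulation approach, the following is a minimal sketch of a multi-variant branching process. It is not the code in `simulations.ipynb` or `simulation-scripts/`. The Poisson offspring distribution, the function names (`poisson`, `simulate_branching`, `frequencies`), and all parameter values are assumptions made purely for illustration.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson random number with Knuth's method (adequate for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_branching(initial, reproduction, generations, seed=0):
    """Simulate independent per-variant branching processes.

    initial:      dict mapping variant name -> infected count at generation 0
    reproduction: dict mapping variant name -> mean number of secondary infections
    Returns a list of dicts (one per generation) with the infected counts.
    """
    rng = random.Random(seed)
    history = [dict(initial)]
    for _ in range(generations):
        current = history[-1]
        following = {}
        for variant, n_infected in current.items():
            # Each infected individual independently infects a Poisson-distributed
            # number of others (hypothetical choice of offspring distribution).
            following[variant] = sum(
                poisson(rng, reproduction[variant]) for _ in range(n_infected)
            )
        history.append(following)
    return history


def frequencies(counts):
    """Convert infected counts into variant frequencies (all zeros if nobody is infected)."""
    total = sum(counts.values())
    return {variant: (n / total if total else 0.0) for variant, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical example: variant "B" transmits more efficiently than variant "A".
    history = simulate_branching(
        {"A": 50, "B": 50}, {"A": 1.0, "B": 1.5}, generations=8, seed=1
    )
    for generation, counts in enumerate(history):
        print(generation, counts, frequencies(counts))
```

In a sketch like this, the variant with the larger mean number of secondary infections tends to rise in frequency over successive generations, which is the kind of signal the simulations in this repository are designed to generate and analyze.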
Parts of the analysis are implemented in C++11 and use the GNU Scientific Library.
This repository is dual-licensed as GPL-3.0 (source code) and CC0 1.0 (figures, documentation, and our presentation of the data).