bodkan/ku-popgen2023

Lecture and exercises on simulations in population genetics at University of Copenhagen 2023

Open your browser-based RStudio here:

on-site participants: http://emily.popgen.dk:3838

remote participants: http://cloud.popgen.dk:3838


You can find the slides here (lecture, exercises, bonus content)

Here is a one-page version (useful for reference while doing the exercises)

Here are solutions to Exercises #1-#3 (#4 is a 'homework')


This README summarizes the steps needed to set up your machine for the lecture and exercises. After you're done installing everything, make sure to run a small test simulation to confirm that everything works as needed.

For course participants: If you don't want to work on your own laptop, you can also use the online RStudio server provided by the course organizers.


Installation instructions

If you will be using RStudio (highly recommended), first apply the workaround described in the "Workaround for an RStudio bug" section below. It fixes a questionable default setting of RStudio which not only complicates using Python from R, but can also break the reproducibility of your analyses.

If you want to use the RStudio server provided for the course

I was told that setup_env() has already been run for every participant, so you don't have to do the step below yourself. I'm leaving it here regardless in case you run into trouble.

If you want to use the online RStudio server provided by the course organizers, you should run this bit of code in the R console after you log in:

library(slendr)
setup_env(agree = TRUE)

This will automatically install and set up necessary Python modules. The server is quite slow, so the process can easily take five or more minutes!

If you want to use your own computer (just in case the server is slow)

You will need:

  • R version 4.x: installers for macOS and Windows are provided here; Linux users will manage.
  • RStudio (not crucial but highly recommended)

Getting slendr to work is critical. The whole lecture is dedicated to this package.

First, run this in your R console:

install.packages("slendr")

Then load slendr itself and set it up by running this bit of code in your R console:

library(slendr)
setup_env(agree = TRUE)

Finally, make sure you get a positive confirmation from the following check:

check_env()

Other R package dependencies

I will use some tidyverse packages for analysis and plotting. You can install them with:

install.packages(c("dplyr", "ggplot2"))
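
Before moving on, it can help to confirm in one go that the R-side requirements listed above are in place. The following is a minimal sketch using only base R; `check_setup()` and `required` are hypothetical helper names introduced here for illustration and are not part of slendr or the course materials. It only checks the R version and whether the packages are installed, so it does not replace check_env(), which verifies the Python environment.

```r
# Hypothetical pre-flight helper (not part of slendr): checks that R is
# version 4.x or newer and that the R packages used in this README are installed.
required <- c("slendr", "dplyr", "ggplot2")

check_setup <- function(pkgs = required, min_r = "4.0.0") {
  # TRUE if the running R is at least min_r
  r_ok <- getRversion() >= min_r
  # requireNamespace() returns TRUE/FALSE without attaching the package
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  list(r_ok = r_ok, missing = pkgs[!installed])
}

status <- check_setup()
if (!status$r_ok) {
  message("R ", as.character(getRversion()), " is older than 4.0.0")
}
if (length(status$missing) > 0) {
  message("Missing packages: ", paste(status$missing, collapse = ", "))
} else {
  message("All required R packages are installed")
}
```

If any packages are reported as missing, rerun the corresponding install.packages() call from the sections above.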

Testing the setup

Copy the following script to your R session after (!) you have successfully installed slendr and run setup_env() as described above.

library(slendr)
init_env()

o <- population("outgroup", time = 1, N = 100)
b <- population("b", parent = o, time = 500, N = 100)
c <- population("c", parent = b, time = 1000, N = 100)
x1 <- population("x1", parent = c, time = 2000, N = 100)
x2 <- population("x2", parent = c, time = 2000, N = 100)
a <- population("a", parent = b, time = 1500, N = 100)

gf <- gene_flow(from = b, to = x1, start = 2100, end = 2150, rate = 0.1)

model <- compile_model(
  populations = list(a, b, x1, x2, c, o), gene_flow = gf,
  generation_time = 1, simulation_length = 2200
)

ts <- msprime(model, sequence_length = 10e6, recombination_rate = 1e-8)

ts

If this runs without error and you get a small summary table from the ts object, you're all set!

Workaround for an RStudio bug

RStudio sometimes interferes with Python, which is needed for simulations.

Go to Tools -> Global Options in your RStudio and set the following options like this:


This work is licensed under a Creative Commons Attribution 4.0 International License.

CC BY 4.0
