christophe-pouzat/YRLS2017

YRLS2017: R for statistics and reproducible research in the life sciences

R and RR (Reproducible Research) course material for the YRLS 2017 meeting.

Course material

The course, R for statistics and reproducible research in the life sciences, introduces the following topics:

  • What's reproducible research?
  • What is R?
  • Why R? R vs. Python / MATLAB / Java / C, etc.
  • How to get R
  • Learning R
  • R syntax and use basics
  • A simple example of reproducible analysis with R

Two RMarkdown files (with the .Rmd extension) are provided: Pouzat_YRLS_20170516.Rmd contains the main part of the course, and Pouzat_YRLS_RR_20170516.Rmd contains an actual, short (and not simple enough!) RR application. The HTML outputs of both files are also included.

To regenerate the HTML outputs from the source files, you first need to install the rmarkdown package. This is done within R with:

install.packages("rmarkdown")

Once this is done, start R in the directory where the two .Rmd files were downloaded and type:

library(rmarkdown)
rmarkdown::render("Pouzat_YRLS_20170516.Rmd")

to regenerate Pouzat_YRLS_20170516.html. Next, install the rhdf5 package; then:

rmarkdown::render("Pouzat_YRLS_RR_20170516.Rmd")

will regenerate Pouzat_YRLS_RR_20170516.html.
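The steps above do not say how to install rhdf5. It is distributed through Bioconductor rather than CRAN, so a plain install.packages("rhdf5") will not find it. The sketch below is one possible way to script the whole sequence, assuming a recent R installation where the BiocManager package is the Bioconductor installer (older setups used a different mechanism). The helper names missing_packages and render_course are hypothetical and are not part of the course material.

```r
# Minimal sketch, not the author's procedure: check which packages are
# missing, install them, then render both course files.

# Return the subset of 'pkgs' that cannot be loaded (hypothetical helper).
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Install what is missing and render the two .Rmd files found in 'dir'
# (hypothetical helper; needs a network connection for the installs).
render_course <- function(dir = ".") {
  # rmarkdown and BiocManager both come from CRAN.
  cran_missing <- missing_packages(c("rmarkdown", "BiocManager"))
  if (length(cran_missing) > 0) install.packages(cran_missing)

  # Assumption: rhdf5 is installed from Bioconductor via BiocManager.
  if (length(missing_packages("rhdf5")) > 0) BiocManager::install("rhdf5")

  rmd_files <- c("Pouzat_YRLS_20170516.Rmd", "Pouzat_YRLS_RR_20170516.Rmd")
  for (f in file.path(dir, rmd_files)) {
    if (file.exists(f)) {
      rmarkdown::render(f)
    } else {
      message("File not found, skipping: ", f)
    }
  }
  invisible(NULL)
}
```

Calling render_course() from the directory holding the two .Rmd files should then reproduce both HTML outputs. Checking for missing packages first avoids reinstalling them on every run.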

Questions and Answers

Here are a few questions that came up at the end of the course, with some (tentative) answers.

R and Excel

Modeling "at large"

A question came up about general modeling strategies, or "how does one go from data to models?". A tricky question! There is no general rule that I know of, but the issue is touched upon in Philipp K. Janert's book Data Analysis with Open Source Tools (mentioned in the course), in chapters 7 to 11 (part II), as well as in part IV of his (excellent) book on gnuplot, Gnuplot in Action.