jvpon/Intro-to-Statistics-in-R-2017

Intro-to-Statistics-in-R-2017

This repository contains the R Markdown (Rmd) and HTML files for the hands-on practicums given at CRG in 2016-2017.


MODULE I. Descriptive Statistics & Intro to Probability.

  1. Descriptive statistics
  • Explore data
    • Exercise with table(), cut(), and quantile()
    • Looking at a subset of the data
    • Missing (NA) values
    • Exercise on missing values
  • Summary statistics
    • Outliers
  • Plots
    • Box-plot
    • Histogram
    • Exercise with hist()
    • Scatterplot
    • Combining plots
    • Empirical cumulative distribution functions (eCDFs)
    • Exercise with eCDF
  2. Distributions
  • Normal distribution
    • Probability density function (PDF)
    • Z-transformation of a random sample from a normal distribution
    • Cumulative distribution function (CDF)
    • The 68-95-99.7 rule
    • Quantile function, or how to obtain the critical values -z and z for a specified area under the standard normal curve
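
As a minimal sketch of the last two topics (not taken from the practicum files), the 68-95-99.7 rule and the critical values can be checked with base R's `pnorm()` and `qnorm()`. The central area of 0.95 is an assumed example value.

```r
# Area under the standard normal curve within k standard deviations of the mean
k <- 1:3
area <- pnorm(k) - pnorm(-k)
print(round(area, 4))  # approximately 0.6827 0.9545 0.9973

# Critical values -z and z that enclose a chosen central area
# (0.95 is an assumed example, not a value from the course material)
central_area <- 0.95
tail_prob <- (1 - central_area) / 2
z_crit <- qnorm(c(tail_prob, 1 - tail_prob))
print(round(z_crit, 2))  # -1.96 1.96
```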

MODULE II. Statistical Inference. Parametric tests.

  • Parametric tests
    • One-sample test on the mean for a random sample drawn from a normally distributed population with known variance: z-test
    • One-sample test on the mean for a sample with unknown variance: t-test
    • Two-sample paired and unpaired tests with unknown variance: t-test
    • Test for proportions: prop.test()
    • Fisher's exact test on proportions
  • Confidence intervals and t-distribution
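
The tests listed above map onto base R functions roughly as follows. This is an illustrative sketch only: all data are simulated, and the variable names and counts are hypothetical rather than taken from the practicum.

```r
set.seed(1)  # simulated data, for illustration only
before  <- rnorm(20, mean = 10, sd = 2)
after   <- before + rnorm(20, mean = 1, sd = 1)
control <- rnorm(25, mean = 10, sd = 2)

# One-sample t-test: does the mean of 'before' differ from 10?
one_sample <- t.test(before, mu = 10)

# Paired t-test: the same subjects measured twice
paired <- t.test(after, before, paired = TRUE)

# Unpaired two-sample t-test (Welch's version is R's default)
unpaired <- t.test(before, control)

# Test for equality of two proportions (hypothetical counts)
successes <- c(15, 25)
trials    <- c(50, 50)
prop_res  <- prop.test(x = successes, n = trials)

# Fisher's exact test on the same counts arranged as a 2x2 table
counts <- matrix(c(15, 35, 25, 25), nrow = 2,
                 dimnames = list(outcome = c("success", "failure"),
                                 group   = c("A", "B")))
fisher_res <- fisher.test(counts)

# 95% confidence interval for the mean, based on the t-distribution
print(one_sample$conf.int)
```

Each call returns an object of class `htest`, so the p-value and confidence interval can be read from `$p.value` and `$conf.int`.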

MODULE III. Statistical Inference. False Discovery Rate. Power analysis. Part 1.

  • FWER, FDR
  • Power analysis
  • Sample size estimation
    • Sample size case study 1: Difference in central tendency (means)
    • Sample size case study 2: Difference in central tendency (means), less noisy
    • Sample size case study 3: Proportions
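
The three case studies can be approximated with base R's `power.t.test()` and `power.prop.test()`. The sketch below uses assumed effect sizes, standard deviations, and proportions; they are not the values used in the practicum.

```r
# Case 1: detect a difference in means of 1 unit with sd = 2 (assumed values)
noisy <- power.t.test(delta = 1, sd = 2, sig.level = 0.05, power = 0.8)

# Case 2: same difference, less noisy data (sd = 1)
less_noisy <- power.t.test(delta = 1, sd = 1, sig.level = 0.05, power = 0.8)

# Case 3: detect a difference between proportions 0.3 and 0.5 (assumed values)
props <- power.prop.test(p1 = 0.3, p2 = 0.5, sig.level = 0.05, power = 0.8)

# Required sample size per group, rounded up to whole observations
n_per_group <- ceiling(c(noisy       = noisy$n,
                         less_noisy  = less_noisy$n,
                         proportions = props$n))
print(n_per_group)
```

Halving the standard deviation doubles the standardized effect size, which is why the less noisy case needs far fewer observations per group.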

MODULE III. Statistical Inference. Power analysis. Part 2. (reiteration of Part 1 with more exercises)

  • Sample size estimation
  • Power of tests
    • Types of errors
    • Power of the one-sample t test
    • Power of the two-sample t test
    • Calculating power using the package “pwr”

MODULE IV. Statistical inference. Non-parametric tests. Data transformation.

  • QQ-plot
  • Tests for normality
  • Data transformation
  • Non-parametric tests
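
A short sketch of how these four topics fit together in base R, using simulated log-normal data. The data and parameter values are assumptions for illustration, not the practicum's data set.

```r
set.seed(42)  # simulated right-skewed data, for illustration only
x <- rlnorm(100, meanlog = 0,   sdlog = 1)
y <- rlnorm(100, meanlog = 0.5, sdlog = 1)

# QQ-plot: points far from the reference line suggest non-normality
qqnorm(x)
qqline(x)

# Shapiro-Wilk test for normality, before and after a log transformation
sw_raw <- shapiro.test(x)
sw_log <- shapiro.test(log(x))
print(c(raw = sw_raw$p.value, log = sw_log$p.value))

# Non-parametric alternative to the two-sample t-test
# (Wilcoxon rank-sum, also known as the Mann-Whitney test)
wilcox_res <- wilcox.test(x, y)
print(wilcox_res$p.value)
```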

MODULE V. Statistical modeling. Regression. ANOVA.

  • Problems on linear regression
  • ANOVA
    • Examine the data
    • One-way ANOVA
    • Post-hoc tests
    • Test assumptions
    • Two-way ANOVA
  • Data transformation
    • Normality
    • Variance stabilization
    • Box-Cox
  • Regression
    • Plots of residuals vs. fitted values
    • Regression: interpretation
    • Comparison of methods
    • Coordinate transformation
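
For orientation, a one-way ANOVA with post-hoc tests and a simple linear regression might look like the sketch below. It uses R's built-in `PlantGrowth` and `cars` data sets, which are an assumption made here for a self-contained example and may differ from the data used in the practicum.

```r
# One-way ANOVA: plant weight across three treatment groups
fit_aov <- aov(weight ~ group, data = PlantGrowth)
print(summary(fit_aov))

# Post-hoc pairwise comparisons with Tukey's honest significant differences
tukey <- TukeyHSD(fit_aov)
print(tukey)

# Checking assumptions: normality of residuals and homogeneity of variances
print(shapiro.test(residuals(fit_aov)))
print(bartlett.test(weight ~ group, data = PlantGrowth))

# Simple linear regression: stopping distance as a function of speed
fit_lm <- lm(dist ~ speed, data = cars)
print(coef(fit_lm))

# Residuals vs. fitted values: a visible pattern hints at a poor model fit
plot(fitted(fit_lm), residuals(fit_lm),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```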