eminsin/Learn-Statistical-and-Regression-Analysis

Learn Statistical and Regression Analysis

Description

Author

Erkam Minsin

Reference

Code

  • Scripted on: Notepad++
  • Executed on: R 4.1.1 and RStudio
  • Dependent on: standard R libraries, plus the UsingR and MASS packages

Contents

  • Instructions on installing R and RStudio
  • Some R essentials
    • Using R as a calculator
    • Assignment
    • Using c() to enter data
    • Using functions on a data vector
    • Creating structured data
  • Accessing data by using indices
    • Assigning values to data vector
    • Logical values
    • Missing values
    • Managing the work environment
  • Reading in other sources of data
    • Using R's built-in libraries and data sets
    • Using the data sets that accompany this book
    • Other methods of data entry
  • Categorical data
    • Tables
    • Barplots
    • Pie charts
    • Factors
  • Numeric data
    • Stem-and-leaf plots
    • Strip charts
    • The center: mean, median, and mode
    • Variation: the variance, standard deviation, and IQR
  • Shape of a distribution
    • Histogram
    • Modes, symmetry, and skew
    • Box plots
  • Pairs of categorical variables
    • Making two-way tables from summarized data
    • Making two-way tables from unsummarized data
    • Marginal distributions of two-way tables
    • Conditional distributions of two-way tables
    • Graphical summaries of two-way contingency tables
  • Comparing independent samples
    • Side-by-side boxplots
    • Density plots
    • Strip charts
    • Quantile-quantile plots
  • Relationships in numeric data
    • Using scatterplots to investigate relationships
    • The correlation between two variables
  • Simple linear regression
    • Using the regression model for prediction
    • Finding the regression coefficients using lm()
    • Transformations of the data
    • Interacting with a scatterplot
    • Outliers in the regression model
    • Resistant regression lines: lqs() and rlm()
    • Trend lines
  • Viewing multivariate data
    • Summarizing categorical data
    • Comparing independent samples
    • Comparing relationships
  • R basics: data frames and lists
    • Creating a data frame or list
    • Accessing values in a data frame
    • Setting values in a data frame or list
    • Applying functions to a data frame or list
  • Using model formulas with multivariate data
    • Boxplots from a model formula
    • The plot() function with a model formula
    • Creating contingency tables with xtabs()
    • Manipulating data frames: split() and stack()
  • Lattice graphics
  • Types of data in R
    • Factors
    • Coercion of objects
  • Populations
    • Discrete random variables
    • Continuous random variables
    • Sampling from a population
    • Sampling distributions
  • Families of distributions
    • Binomial, normal, and some other named distributions
    • Popular distributions to describe populations
    • Sampling distributions
  • The central limit theorem
    • Normal parent population
    • Nonnormal parent population
  • The normal approximation for the binomial
  • for loops
  • Simulations related to the central limit theorem
  • Defining a function
    • Editing a function
    • Function arguments
    • The function body
  • Investigating distributions
    • Script files and source()
    • The geometric distribution
  • Bootstrap samples
  • Alternatives to for loops
  • Confidence interval ideas
    • Finding confidence intervals using simulation
  • Confidence intervals for a population proportion, p
    • Using prop.test() to find confidence intervals
  • Confidence intervals for the population mean, mu
    • One-sided confidence intervals
  • Other confidence intervals
  • Confidence intervals for differences
    • Difference of proportions
    • Difference of means
    • Matched samples
  • Confidence intervals for the median
    • Confidence intervals based on the binomial
    • Confidence intervals based on the signed-rank statistic
    • Confidence intervals based on the rank-sum statistic
  • Significance test for a population proportion
    • Using prop.test() to compute p-values
  • Significance test for the mean (t-tests)
  • Significance tests and confidence intervals
  • Significance tests for the median
    • The sign test
    • The signed-rank test
  • Two-sample tests of proportion
  • Two-sample tests of center
    • Two-sample tests of center with normal populations
    • Matched samples
    • The Wilcoxon rank-sum test for equality of center
  • The chi-squared goodness-of-fit test
    • The multinomial distribution
    • Pearson's chi-squared statistic
  • The chi-squared test of independence
    • The chi-squared test of homogeneity
  • Goodness-of-fit tests for continuous distributions
    • Kolmogorov-Smirnov test
    • The Shapiro-Wilk test for normality
    • Finding parameter values using fitdistr()
  • The simple linear regression model
    • Model formulas for linear models
    • Examples of the linear model
    • Estimating the parameters in simple linear regression
    • Using lm() to find the estimates
  • Statistical inference for simple linear regression
    • Testing the model assumptions
    • Statistical inferences
    • Using lm() to find values for a regression model
  • Multiple linear regression
    • Fitting the multiple regression model using lm()
    • Interpreting the regression parameters
    • Statistical inferences
    • Model selection
  • One-way ANOVA
    • Using R's model formulas to specify ANOVA models
    • Using oneway.test() to perform ANOVA
    • Using aov() for ANOVA
    • The nonparametric Kruskal-Wallis test
  • Using lm() for ANOVA
    • Treatment coding for analysis of variance
    • Comparing multiple differences
  • ANCOVA
  • Two-way ANOVA
    • Treatment coding for additive two-way ANOVA
    • Testing for row or column effects
    • Testing for interactions
  • Logistic regression
    • Generalized linear models
    • Fitting the model using glm()
  • Nonlinear models
    • Fitting nonlinear models with nls()