The purpose of this project was to analyze a fictitious auto manufacturing company's production problems. There are troubles that are blocking the manufacturing team's progress. R is used to perform multiple linear regression analysis, collect summary statistics, run t-tests and design a statistical study to compare vehicle performance.

JeremyKRay/MechaCar_Statistical_Analysis


MechaCar_Statistical_Analysis

Overview of the Analysis

The purpose of this project was to analyze production problems with MechaCar, the newest prototype from AutosRUs, a fictitious auto manufacturing company. Problems are blocking the manufacturing team's progress, and analysis of the company's manufacturing data is needed to determine where the company can improve. The analysis is performed in R and is separated into three main components.

A) A multiple linear regression analysis is performed to identify which variables in the dataset predict the mpg of MechaCar prototypes.

B) Summary statistics on the pounds per square inch (PSI) of the suspension coils from the manufacturing lots.

C) T-tests are run to determine if the manufacturing lots are statistically different from the population mean.

Finally, a statistical study is designed to compare vehicle performance against vehicles from other manufacturers. A summary interpretation of the findings for each statistical analysis is included.

Technology Used

R

Results

Linear Regression to Predict MPG

Using an R script, I created a multiple linear regression model and reviewed its summary statistics to determine which variables affect mpg. The dataset I analyzed is a production dataset for AutosRUs' newest prototype, MechaCar.
The results are shown below.

Linear Regression.png

Linear Regression and Summary.png

The statistical analysis provided 3 main findings.

1.) According to our results, vehicle length and ground clearance (as well as the intercept) are statistically unlikely to provide random amounts of variance to the linear regression model. In other words, vehicle length and ground clearance have a significant impact on mpg. Conversely, vehicle weight, spoiler angle and AWD are likely to provide random amounts of variance to the model and therefore do not have a significant impact on mpg. The intercept is also statistically significant, meaning the model's predicted mpg when all independent variables equal zero differs significantly from zero. Depending on the dataset, a significant intercept could mean that the significant features (such as vehicle length and ground clearance) need scaling or transforming to improve the predictive power of the model. Alternatively, it may mean that variables not included in the model help explain the variability of mpg. Depending on the dataset and the desired performance of the model, we may want to change or transform the independent variables and then re-evaluate the coefficients and their significance.

2.) The slope of the linear model is not considered to be zero. The p-value of the model is 5.35e-11, far below the assumed significance level of 0.05. We can therefore reject the null hypothesis that the slope is zero; at least one independent variable has a non-zero coefficient.

3.) Although only two of the five independent variables are significant, this multiple linear regression model does predict the mpg of MechaCar prototypes effectively. According to the summary output, the r-squared value of the model is 0.71, meaning the model explains roughly 71% of the variability in mpg, and the p-value is well below the 0.05 significance level.
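The three findings above come from the coefficient table, the overall p-value and the r-squared value of a fitted model. The following is a minimal sketch of how those pieces can be pulled out of `lm()` in base R. It is not the project's script: the column names are assumptions based on the variables named above, and the data frame is filled with random placeholder numbers rather than the MechaCar dataset, so the output will not match the reported results.

```r
# Placeholder data only: random numbers standing in for the MechaCar
# production dataset. Column names are assumed, not taken from the project.
set.seed(42)
n <- 50
mecha <- data.frame(
  vehicle_length   = rnorm(n, mean = 15, sd = 2),
  vehicle_weight   = rnorm(n, mean = 6000, sd = 500),
  spoiler_angle    = runif(n, min = 0, max = 90),
  ground_clearance = rnorm(n, mean = 13, sd = 2),
  AWD              = rbinom(n, size = 1, prob = 0.5)
)
# Invented response so the model has something to fit.
mecha$mpg <- 6 * mecha$vehicle_length + 3 * mecha$ground_clearance +
  rnorm(n, sd = 8) - 100

# Fit mpg against all five independent variables.
fit <- lm(mpg ~ vehicle_length + vehicle_weight + spoiler_angle +
            ground_clearance + AWD, data = mecha)
fit_summary <- summary(fit)

# Finding 1: per-coefficient estimates and p-values ("Pr(>|t|)" column).
coef_table <- fit_summary$coefficients

# Finding 2: the overall p-value is not stored in the summary object,
# so it is derived here from the F statistic and its degrees of freedom.
f <- fit_summary$fstatistic
model_p <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

# Finding 3: share of the variance in mpg explained by the model.
r_squared <- fit_summary$r.squared

print(coef_table)
print(model_p)
print(r_squared)
```

Printing `fit_summary` directly shows the same numbers in the familiar console layout; extracting them as values is only useful when they need to be compared against a threshold in code.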

Summary Statistics on Suspension Coils

The design specifications for the MechaCar suspension coils dictate that the variance of the suspension coil PSI must not exceed 100. According to the Total Summary, the variance across all suspension coils is 62.29 (standard deviation 7.9 psi), which is below the limit of 100, so the design specification is met when looking at all lots together. However, the Lot Summary shows that Lot 3 should be investigated further. Its variance of 170.29 is well above the limit, and its standard deviation is also much higher than that of the other two lots, whose variances are only 0.98 and 7.47. Please see the results of the analysis below.

Total Summary.png

Lot Summary
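As a rough illustration of how the total and per-lot summaries can be produced, here is a base R sketch. The column names `Manufacturing_Lot` and `PSI` are assumptions, and the readings are random placeholder values rather than the MechaCar measurements, so the numbers it prints will differ from those quoted above.

```r
# Placeholder data only: invented PSI readings, not the MechaCar lots.
# Column names are assumed for illustration.
set.seed(1)
coils <- data.frame(
  Manufacturing_Lot = rep(c("Lot1", "Lot2", "Lot3"), each = 50),
  PSI = c(rnorm(50, 1500, 1), rnorm(50, 1500, 3), rnorm(50, 1496, 13))
)

# Helper that computes the four statistics for one vector of readings.
summarize_psi <- function(psi) {
  data.frame(Mean = mean(psi), Median = median(psi),
             Variance = var(psi), SD = sd(psi))
}

# Summary across all lots.
total_summary <- summarize_psi(coils$PSI)

# The same statistics computed separately for each lot.
lot_summary <- do.call(
  rbind,
  lapply(split(coils$PSI, coils$Manufacturing_Lot), summarize_psi)
)

# Flag any lot whose variance is above the design limit of 100.
lot_summary$Exceeds_Spec <- lot_summary$Variance > 100

print(total_summary)
print(lot_summary)
```

Computing the per-lot table alongside the total is what exposes a problem lot: a pooled variance can sit under the limit while one lot is well over it.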

T-Tests on Suspension Coils

The results below show the t-tests performed to determine whether all manufacturing lots together, and each lot individually, differ from the population mean of 1,500 pounds per square inch. A brief explanation of what each t-test demonstrates is included with the results.

All Lots

The t-test performed on all of the lots shows that, taken as a whole, they are not significantly different from the population mean of 1,500 psi for suspension coils. The p-value is 0.06, above our significance level of 0.05, and the mean is 1498.78.

Lot 1

The t-test performed on Lot 1 shows that Lot 1 is not significantly different from the population mean of 1,500 psi for suspension coils. The p-value is 1 and the mean is exactly 1500.

Lot 2

The t-test performed on Lot 2 shows that Lot 2 is not significantly different from the population mean of 1,500 psi for suspension coils. The p-value is 0.6072, above our significance level of 0.05, and the mean is 1500.2.

Lot 3

The t-test performed on Lot 3 shows that Lot 3 is significantly different from the population mean of 1,500 psi for suspension coils. The p-value is 0.04168, below our significance level of 0.05, and the mean is 1496.14.
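The four tests above are one-sample t-tests against a fixed population mean. A minimal base R sketch of that pattern follows. The column names are assumptions and the readings are random placeholder values, not the MechaCar data, so the p-values it prints are not the ones reported above.

```r
# Placeholder data only: invented PSI readings, not the MechaCar lots.
# Column names are assumed for illustration.
set.seed(7)
coils <- data.frame(
  Manufacturing_Lot = rep(c("Lot1", "Lot2", "Lot3"), each = 50),
  PSI = c(rnorm(50, 1500, 1), rnorm(50, 1500, 3), rnorm(50, 1496, 13))
)

# One-sample t-test of all readings against the population mean of 1500.
all_lots <- t.test(coils$PSI, mu = 1500)

# The same test run once per lot.
lot_tests <- lapply(split(coils$PSI, coils$Manufacturing_Lot),
                    t.test, mu = 1500)

# Collect the p-value and sample mean for each lot.
lot_p     <- sapply(lot_tests, function(res) res$p.value)
lot_means <- sapply(lot_tests, function(res) unname(res$estimate))

# A lot differs significantly from 1500 when its p-value is below 0.05.
lot_significant <- lot_p < 0.05

print(all_lots$p.value)
print(data.frame(mean = lot_means, p_value = lot_p,
                 significant = lot_significant))
```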

Summary

The above statistical analysis has revealed an important difference among the lots for the MechaCar prototype. The pounds per square inch of the suspension coils in cars from Lot 3 differ significantly from the population mean of 1,500 psi, unlike those of the other two lots, and the variance of Lot 3 exceeds the design specification. So, before moving forward with designing a study comparing MechaCar vs the competition, I suggest that AutosRUs either exclude Lot 3 from the study or fix the issue with the suspension coils to bring the cars closer to the population mean.

Study Design: MechaCar vs Competition

Description

Now that the MechaCar production data has been analyzed, it is time to see how the MechaCar might stack up against the competition. I plan to perform the following statistical study to determine whether cost, fuel efficiency and safety rating have an effect on consumer rating and then, using the same metrics, predict what consumer rating the MechaCar can expect and how that compares to the competition.

Metrics

  • Cost
  • Fuel efficiency
      • City
      • Highway
  • Safety Rating
  • Consumer Rating

Null Hypothesis

Cost, Fuel efficiency, and Safety Rating do not have a statistically significant impact on consumer rating.

Alternative Hypothesis

Cost, Fuel efficiency, and Safety Rating do have a statistically significant impact on consumer rating.

Statistical Test

Multiple linear regression, with consumer rating as the dependent variable and cost, fuel efficiency (city and highway) and safety rating as the independent variables, to determine how much of the variance in consumer rating those variables explain.

Data

Cost, fuel efficiency and Safety rating data for the MechaCar and the competition.

Consumer reports data on the competition
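The proposed study could be sketched in R along the following lines. Everything here is hypothetical: the column names, the competitor data (random placeholder values) and the MechaCar figures passed to `predict()` are invented purely to show the shape of the analysis, since the study has not been run.

```r
# Hypothetical competitor data: random placeholder values, invented only
# to make the sketch runnable. Column names are assumed.
set.seed(3)
n <- 40
rivals <- data.frame(
  cost          = runif(n, min = 20000, max = 60000),
  city_mpg      = rnorm(n, mean = 25, sd = 4),
  highway_mpg   = rnorm(n, mean = 33, sd = 4),
  safety_rating = sample(1:5, n, replace = TRUE)
)
rivals$consumer_rating <- 50 - 0.0003 * rivals$cost +
  0.5 * rivals$city_mpg + 0.4 * rivals$highway_mpg +
  3 * rivals$safety_rating + rnorm(n, sd = 4)

# Regress consumer rating on the proposed metrics.
study_fit <- lm(consumer_rating ~ cost + city_mpg + highway_mpg +
                  safety_rating, data = rivals)

# The overall F-test addresses the null hypothesis that none of the
# metrics has an effect on consumer rating.
f <- summary(study_fit)$fstatistic
study_p <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
reject_null <- study_p < 0.05

# Hypothetical MechaCar figures, used to predict its consumer rating
# together with a 95% prediction interval.
mechacar <- data.frame(cost = 35000, city_mpg = 28,
                       highway_mpg = 36, safety_rating = 4)
predicted <- predict(study_fit, newdata = mechacar, interval = "prediction")

print(reject_null)
print(predicted)
```

The prediction interval matters for the comparison: the MechaCar's expected rating is only meaningfully above or below a competitor's if that competitor's rating falls outside the interval.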
