Data Science Theories And Methods

This project covers data science theory and data preprocessing, using the Titanic data set provided by Kaggle. The goal is to predict which passengers survived the disaster and which did not. This binary classification problem was addressed with two approaches, each applying the same three machine learning algorithms.

The first approach is the conventional one, in which the training and testing sets are split manually. The second uses the training and testing sets as provided by Kaggle, without further manipulation. Both approaches apply the Logit model (with and without gradient descent), Random Forests, and Support Vector Machines. With the given partitioning, accuracy rates are close to 100%. In contrast, the conventional approach yields an accuracy of approximately 85%.
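To make the conventional approach concrete, here is a minimal sketch of a manual train/test split followed by a Logit model fitted with batch gradient descent. It is not the repository's code: the scripts there are written in R, while this illustration uses Python with only the standard library. The data is a synthetic stand-in rather than the Titanic set, and every name (`train_test_split`, `fit_logit`, `make_toy_data`), along with the learning rate, epoch count, and 80/20 split, is an assumption chosen for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_test_split(rows, test_fraction=0.2, seed=0):
    # Shuffle a copy of the rows, then cut it into train and test portions.
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def linear_score(w, x):
    # w[0] is the bias; w[1:] are the feature weights.
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def fit_logit(rows, learning_rate=0.1, epochs=500):
    # Batch gradient descent on the mean log-loss.
    # rows is a list of (features, label) pairs with label in {0, 1}.
    w = [0.0] * (len(rows[0][0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in rows:
            err = sigmoid(linear_score(w, x)) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wi - learning_rate * g / len(rows) for wi, g in zip(w, grad)]
    return w


def predict(w, x):
    # Classify as 1 when the predicted probability is at least 0.5.
    return 1 if sigmoid(linear_score(w, x)) >= 0.5 else 0


def accuracy(w, rows):
    return sum(predict(w, x) == y for x, y in rows) / len(rows)


def make_toy_data(n=400, seed=1):
    # Synthetic stand-in, NOT the Titanic data: two features and a label
    # drawn from a noisy linear rule.
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 1 if 2.0 * x[0] - 1.0 * x[1] + rng.gauss(0, 0.3) > 0 else 0
        rows.append((x, y))
    return rows


toy = make_toy_data()
train, test = train_test_split(toy)
weights = fit_logit(train)
print("held-out accuracy:", round(accuracy(weights, test), 3))
```

The accuracy is measured only on rows the model never saw during fitting, which is what makes the manual split a fair estimate of generalization.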

Repository files (20 commits):
Conventional Approach.R
Implementing Gradient Descent.R
README.md
Using Split Sets Given By Kaggle.R

mselias/A-Challenge-of-Titanic-Proportions