PolicyRenewals (work in progress)

Data preparation, EDA, and modeling on an imbalanced dataset from Kaggle: https://www.kaggle.com/arashnic/imbalanced-data-practice?select=aug_train.csv

In this exercise:

- Explore deep learning and random forests with the h2o ML package
- Apply explainable AI (XAI) methods with the DALEX package
- Work with SMOTE and ROSE methods for upsampling, and with PCA
- Further improve my data wrangling skills with dplyr
- Perform automatic feature engineering
- Fit, evaluate, and compare caret models
- Create more advanced ggplot2 graphics for EDA
- Play around with tidyquant (Excel-like functions, e.g. pivot tables)
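To illustrate the upsampling item above, here is a minimal sketch of plain random oversampling in base R. It is not the code from `Renewals.R`, and it is a simpler stand-in for SMOTE and ROSE, which generate synthetic rows rather than duplicating existing ones. The toy data, the column names `Age` and `Response`, and the helper `oversample_minority` are all hypothetical and chosen only for illustration.

```r
# Toy imbalanced data (hypothetical): 90 rows of class 0, 10 rows of class 1.
set.seed(42)
toy <- data.frame(
  Age = round(runif(100, 20, 80)),
  Response = factor(c(rep(0, 90), rep(1, 10)))
)

# Random oversampling: resample rows of each smaller class with replacement
# until every class has as many rows as the largest class.
# `oversample_minority` is an illustrative helper, not a package function.
oversample_minority <- function(df, target) {
  counts <- table(df[[target]])
  majority_n <- max(counts)
  parts <- lapply(names(counts), function(level) {
    rows <- which(df[[target]] == level)
    extra <- majority_n - length(rows)
    if (extra > 0) {
      # Index into `rows` via sample.int to avoid sample()'s
      # special behaviour when given a single number.
      rows <- c(rows, rows[sample.int(length(rows), extra, replace = TRUE)])
    }
    df[rows, , drop = FALSE]
  })
  do.call(rbind, parts)
}

balanced <- oversample_minority(toy, "Response")
table(balanced$Response)
```

After the call, both classes in `balanced` have the same number of rows. When evaluating a model, it is usually safer to oversample only the training split, so that duplicated rows do not leak into the test data.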

Despite the repository name, we build a model to predict whether last year's Health Insurance policyholders will also be interested in Vehicle Insurance provided by the company. [Kaggle]

Comments: Overall, this exercise would be more interesting if there were more information on the variables, e.g. the regions or the sales channels. With that information, some well-informed and justified feature engineering could be performed. The same applies to the visual presentation.

Karol-Gawlowski/PolicyRenewals