https://www.kaggle.com/competitions/house-prices-advanced-regression-techniques/
https://www.tableau.com/learn/articles/what-is-data-cleaning
- Missing Values
- Remove outliers?
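A minimal sketch of the two cleaning steps above, assuming pandas and NumPy. The tiny DataFrame is made-up toy data that only borrows column names from the competition. Median / "None" imputation and the 1.5 * IQR rule are assumed policies, not choices these notes have settled on.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the competition's train.csv; all values are made up.
df = pd.DataFrame({
    "GrLivArea": [1500.0, 1700.0, np.nan, 1600.0, 1800.0, 5600.0],
    "GarageType": ["Attchd", None, "Detchd", "Attchd", "Attchd", "Attchd"],
    "SalePrice": [180000, 200000, 175000, 190000, 210000, 160000],
})

# 1) Missing values: count them per column before deciding what to do.
missing_counts = df.isna().sum()

# Assumed policy: median for numeric columns, a "None" label for categoricals.
num_cols = df.select_dtypes(include="number").columns
cat_cols = df.columns.difference(num_cols)
filled = df.copy()
filled[num_cols] = filled[num_cols].fillna(filled[num_cols].median())
filled[cat_cols] = filled[cat_cols].fillna("None")

# 2) Outliers: keep rows within 1.5 * IQR of the quartiles (an assumed rule).
q1, q3 = filled["GrLivArea"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = filled["GrLivArea"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
cleaned = filled[in_range]

print(missing_counts)
print(cleaned)
```

Whether to drop the flagged rows at all is still the open question in the notes; the mask `in_range` can just as well be used only to inspect them.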
https://www.kaggle.com/code/pmarcelino/comprehensive-data-exploration-with-python/notebook
https://www.kaggle.com/code/dgawlik/house-prices-eda/notebook
https://www.kaggle.com/code/datafan07/beginner-eda-with-feature-eng-and-blending-models
https://www.kaggle.com/code/vipin20/house-prices-eda-feature-engineering#EDA
- Remove imbalanced data
- Reduce skewness:
- Choose the ones that have an obvious relationship with the target in the boxplots
- Use boxplots to show the relationship between the target and the categorical predictors: https://www.kaggle.com/code/datafan07/beginner-eda-with-feature-eng-and-blending-models
- Use one-hot encoding to transform categorical data
- After transforming the categorical values into numerical ones, calculate the correlation matrix: https://www.kaggle.com/code/vipin20/house-prices-eda-feature-engineering#EDA
- ??
- ??
- ??
- Model Performance
- Feature Importance
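The skewness, one-hot encoding, and correlation steps listed earlier could be sketched as follows, assuming pandas and NumPy. The data is a made-up toy frame that only borrows column names from the competition. Using `np.log1p` on the target is one common way to reduce skew and is an assumption here, as is the hypothetical `LogSalePrice` column name.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the training data; all values are made up.
df = pd.DataFrame({
    "SalePrice": [100000, 120000, 130000, 150000, 180000, 755000],
    "GrLivArea": [900, 1100, 1200, 1400, 1700, 4300],
    "CentralAir": ["N", "Y", "Y", "Y", "Y", "Y"],
})

# Reduce skewness: log1p compresses the long right tail of the target.
skew_before = df["SalePrice"].skew()
df["LogSalePrice"] = np.log1p(df["SalePrice"])  # hypothetical column name
skew_after = df["LogSalePrice"].skew()

# One-hot encode the categorical column so it can enter a correlation matrix.
encoded = pd.get_dummies(df, columns=["CentralAir"], dtype=int)

# Correlation of every column with the log target, strongest first.
corr_with_target = encoded.corr()["LogSalePrice"].sort_values(ascending=False)

print(f"skew before: {skew_before:.2f}, after: {skew_after:.2f}")
print(corr_with_target)
```

On the real data the same `corr_with_target` ranking can be used to shortlist predictors before modelling, alongside the boxplot check for categorical ones.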