GitHub - joyce-lin/Project_House_Price_Prediction: Predicting House Sales Prices Using Advanced Regression Techniques

Predicting House Sales Prices Using Advanced Regression Techniques

This project was based on a Kaggle competition: https://www.kaggle.com/c/house-prices-advanced-regression-techniques

The goal of this project is to practice feature engineering, exploratory data analysis (EDA), and machine learning regression techniques in order to select the best prediction model for home sale prices.

The dataset I used consisted of 79 variables describing features of homes in Ames, Iowa.

I used NumPy, pandas, and Seaborn plots for exploratory data analysis and data cleaning.
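A minimal sketch of what this kind of cleaning step can look like in pandas. The column names follow the Kaggle data dictionary, but the values are made up for illustration, and median imputation is an assumption rather than the method used in the notebooks.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for train.csv; the values are invented for illustration only.
df = pd.DataFrame({
    "LotFrontage": [65.0, np.nan, 68.0, 60.0],
    "GrLivArea": [1710, 1262, 1786, 1717],
    "SalePrice": [208500, 181500, 223500, 140000],
})

# Count missing values per column before cleaning.
missing = df.isnull().sum()

# Assumed cleaning choice: fill numeric gaps with the column median.
df["LotFrontage"] = df["LotFrontage"].fillna(df["LotFrontage"].median())

# Rank features by their linear correlation with the target.
corr = df.corr()["SalePrice"].sort_values(ascending=False)

print(missing)
print(corr)
```

On the real data the same correlation table is a natural input for a Seaborn heatmap.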

Using the scikit-learn library, I built a series of grid-searched k-nearest-neighbors, tree-based, and linear regression models to try to improve predictive accuracy.
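The grid-search comparison described above might be sketched as follows. The synthetic data, the specific estimators (`KNeighborsRegressor`, `DecisionTreeRegressor`, `Ridge`), and the parameter grids are all assumptions chosen for illustration; the notebooks may use different models and grids.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for the cleaned housing features.
rng = np.random.RandomState(42)
X = rng.rand(200, 5)
y = X @ np.array([3.0, -2.0, 1.0, 0.5, 0.0]) + 0.1 * rng.randn(200)

# One (estimator, parameter grid) pair per model family; grids are hypothetical.
candidates = {
    "knn": (KNeighborsRegressor(), {"n_neighbors": [3, 5, 9]}),
    "tree": (DecisionTreeRegressor(random_state=42), {"max_depth": [3, 5, None]}),
    "ridge": (Ridge(), {"alpha": [0.1, 1.0, 10.0]}),
}

# Run a 5-fold cross-validated grid search for each family.
results = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=5)
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

# Pick the family with the highest mean cross-validated score.
best_name = max(results, key=lambda k: results[k][0])
print(best_name, results[best_name])
```

Comparing `best_score_` across families keeps the selection based on cross-validation rather than on a single split.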

My best model, the one with the highest test score, is a "Random Forest Regressor" with the following parameters:

      bootstrap=True, criterion='mse', max_depth=None,
       max_features='auto', max_leaf_nodes=None,
       min_impurity_split=1e-07, min_samples_leaf=1,
       min_samples_split=2, min_weight_fraction_leaf=0.0,
       n_estimators=28, n_jobs=1, oob_score=False, random_state=42
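A sketch of how these settings could be passed to `RandomForestRegressor` on a recent scikit-learn release. Three of the listed names have since changed: `criterion='mse'` was renamed to `'squared_error'`, `max_features='auto'` was removed (for regressors it meant all features, written here as `1.0`), and `min_impurity_split` was removed in favor of `min_impurity_decrease`. The helper name `build_forest` is hypothetical.

```python
from sklearn.ensemble import RandomForestRegressor


def build_forest():
    """Return a forest configured like the parameter list above (hypothetical helper)."""
    return RandomForestRegressor(
        n_estimators=28,
        bootstrap=True,
        # criterion is left at its default, which is squared error
        # ('mse' in older releases, 'squared_error' in newer ones).
        max_depth=None,
        max_features=1.0,  # assumed equivalent of the old 'auto' for regressors
        max_leaf_nodes=None,
        min_samples_leaf=1,
        min_samples_split=2,
        min_weight_fraction_leaf=0.0,
        n_jobs=1,
        oob_score=False,
        random_state=42,
    )


model = build_forest()
print(model.get_params()["n_estimators"])
```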

On the "train.csv" dataset, this model achieved a Train_Score of 0.97292543997 and a Test_Score of 0.888000886093.
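A minimal sketch of how a train/test score pair like this can be computed from a single labeled file. It assumes a hold-out split and scikit-learn's default `score` method, which returns R² for regressors; the README does not state the metric or the split ratio, and the data here is synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the features and SalePrice from train.csv.
rng = np.random.RandomState(42)
X = rng.rand(300, 6)
y = 10 * X[:, 0] + 5 * X[:, 1] ** 2 + rng.randn(300)

# Assumed 70/30 hold-out split of the labeled data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

model = RandomForestRegressor(n_estimators=28, random_state=42)
model.fit(X_train, y_train)

# score() returns the coefficient of determination (R^2) for regressors.
train_score = model.score(X_train, y_train)
test_score = model.score(X_test, y_test)
print("Train_Score:", train_score, "Test_Score:", test_score)
```

A train score noticeably above the test score, as in the figures reported here, is typical of fully grown trees and may indicate some overfitting.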

Repository contents: data, ipynb, lib, pickled, README.md
