salauddintapu/House_Price_Prediction

House_Price_Prediction

Implementation of supervised statistical learning algorithms for predicting house prices. This is a regression task, so I have used Linear Regression along with other algorithms to predict housing prices using the ParisHousing.csv dataset.

Dataset

The "ParisHousing.csv" dataset was collected from Kaggle. The dataset is available here: ParisHousing.csv

Linear Regression

It is a supervised learning algorithm that uses a statistical method to predict continuous (real-valued) numerical targets. It is used for predicting sales, salary, weather, age, product price, etc. More on Linear Regression: JavaPoint, GeeksforGeeks.
Accuracy: 99.99% (0.999999561980484)
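
As a rough illustration of the fit-and-score workflow, here is a minimal sketch. It assumes scikit-learn, which this README does not name, and uses synthetic stand-in data instead of ParisHousing.csv. For scikit-learn regressors, score() returns the coefficient of determination (R^2) and not a classification accuracy. Whether the figure above was computed this way is an assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical): the target is a nearly linear
# function of two features, plus a little noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(500, 2))
y = 1000.0 * X[:, 0] + 50.0 * X[:, 1] + rng.normal(0, 10, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression()  # default settings, nothing tuned
model.fit(X_train, y_train)

# score() on a scikit-learn regressor returns R^2 on the given data.
r2 = model.score(X_test, y_test)
print(f"R^2 on held-out data: {r2:.6f}")
```

When the target is almost an exact linear function of the features, as in this toy data, R^2 ends up very close to 1.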

Polynomial Regression

Essentially linear regression, but it models the relationship between the dependent and independent variables as an n-th degree polynomial.
Accuracy: For 2nd Degree: 0.9999995497190701, For 3rd Degree: 0.9999994956324605
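
One common way to build such a model is to expand the inputs into polynomial terms and then fit an ordinary linear regression on them. The sketch below assumes scikit-learn (PolynomialFeatures plus LinearRegression in a pipeline) and synthetic quadratic data. It is not necessarily how the notebooks in this repository do it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data (hypothetical): a noisy quadratic in one variable.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(300, 1))
y = 2.0 * X[:, 0] ** 2 - X[:, 0] + 1.0 + rng.normal(0, 0.1, size=300)

scores = {}
for degree in (1, 2, 3):
    # Expand X into all polynomial terms up to `degree`, then fit a linear model.
    poly_model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    poly_model.fit(X, y)
    scores[degree] = poly_model.score(X, y)  # R^2 on the training data
    print(f"degree {degree}: R^2 = {scores[degree]:.4f}")
```

The model stays linear in its coefficients. Only the feature set changes with the degree.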

Support Vector Regressor

It is a support vector machine algorithm adapted for regression tasks like this one.
Accuracy: 99.23% (0.9923266599052655)

Decision Tree Regressor

Decision tree algorithm for regression tasks.
Accuracy: 99.99% (0.9999962088679886)

Random Forest Regressor

Random Forest algorithm for regression analysis.
Accuracy: 99.99% (0.9999981046318548)
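
The three regressors above can be compared with a short loop. This is a sketch under assumptions: it uses scikit-learn estimators with default hyperparameters (random_state is set only to make the run repeatable) and synthetic stand-in data, so the printed scores will not match the figures reported here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (hypothetical): a noisy linear target in three features.
rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=(400, 3))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] - X[:, 2] + rng.normal(0, 0.1, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Default hyperparameters; random_state only fixes the randomness.
models = {
    "SVR": SVR(),
    "DecisionTreeRegressor": DecisionTreeRegressor(random_state=0),
    "RandomForestRegressor": RandomForestRegressor(random_state=0),
}

results = {}
for name, reg in models.items():
    reg.fit(X_train, y_train)
    results[name] = reg.score(X_test, y_test)  # R^2 on held-out data
    print(f"{name}: {results[name]:.4f}")
```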

Discussion

Linear regression has shown the best accuracy for this dataset. After the linear and polynomial regression models, Random Forest Regressor has shown the highest accuracy among the remaining algorithms. RF is a powerful algorithm for both regression and classification problems. Still, LR outperformed RF for this regression problem. Another observation is that as the degree of the polynomial increases, the accuracy decreases slightly (0.9999995497 for degree 2 versus 0.9999994956 for degree 3) and training becomes more costly.
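
The cost growth can be made concrete by counting polynomial terms. With n input features, the number of monomials of total degree at most d, including the constant term, is C(n + d, d). The feature count of 16 used below is a hypothetical value chosen for illustration and is not taken from this README.

```python
from math import comb


def n_polynomial_terms(n_features: int, degree: int) -> int:
    """Count monomials of total degree <= degree in n_features variables,
    including the constant (bias) term."""
    return comb(n_features + degree, degree)


n_features = 16  # hypothetical feature count, for illustration only
for degree in (1, 2, 3):
    print(f"degree {degree}: {n_polynomial_terms(n_features, degree)} terms")
```

Each extra degree multiplies the number of columns the linear model has to fit, which is one reason higher degrees take longer to train.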

NB: None of these models are optimized. I have used the preset default values. The data_handle.ipynb notebook is for data preprocessing. I have used the well-known Pandas library for this task. After preprocessing, I split the dataset into a training file and a testing file to make the job easier.
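
A split like the one described can be sketched with Pandas alone. The helper name split_frame, the 80/20 ratio, the column names, and the output file names are all hypothetical. The actual steps are in data_handle.ipynb.

```python
import pandas as pd


def split_frame(df: pd.DataFrame, train_frac: float = 0.8, seed: int = 0):
    """Randomly sample train_frac of the rows for training; the rest is the test set."""
    train = df.sample(frac=train_frac, random_state=seed)
    test = df.drop(train.index)  # rows not sampled for training
    return train, test


# Tiny stand-in frame; the column names are hypothetical.
demo = pd.DataFrame(
    {"squareMeters": range(10), "price": [m * 100.0 for m in range(10)]}
)
train_df, test_df = split_frame(demo)

# Writing each part to its own file could look like this (hypothetical names):
# train_df.to_csv("train.csv", index=False)
# test_df.to_csv("test.csv", index=False)
print(len(train_df), len(test_df))
```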
