Implementation of supervised statistical learning algorithms for predicting house prices. This is a regression task, so I used Linear Regression along with other algorithms to predict housing prices from the ParisHousing.csv dataset.
The "ParisHousing.csv" dataset was collected from Kaggle. The dataset is available here: ParisHousing.csv
Linear regression is a supervised learning algorithm that uses a statistical method to predict continuous (real-valued) numerical targets. It is used for predicting sales, salary, weather, age, product price, etc. More on Linear Regression: JavaPoint, GeeksforGeeks.
Accuracy: 99.99% (0.999999561980484)
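A minimal sketch of this step, assuming scikit-learn is the library in use (the README does not name it). The data below is synthetic and only stands in for ParisHousing.csv; the feature weights are made up for illustration. If the accuracy figure above came from `score()`, note that for regressors it reports the coefficient of determination (R^2), not a classification accuracy.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing data (assumption): the target is
# almost exactly linear in the features, plus a little noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(500, 3))
y = X @ np.array([100.0, 5.0, 2.0]) + 50.0 + rng.normal(0, 1.0, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Default settings, matching the "no tuning" note in this README.
model = LinearRegression()
model.fit(X_train, y_train)

# score() returns R^2 on the held-out data.
r2_linear = model.score(X_test, y_test)
print(f"R^2 on held-out data: {r2_linear:.6f}")
```

When the target really is close to a linear function of the features, as in this toy data, a plain linear model gets an R^2 very near 1.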
Essentially linear regression, but it models the relationship between the dependent and independent variables as an n-th degree polynomial.
Accuracy: For 2nd Degree: 0.9999995497190701, For 3rd Degree: 0.9999994956324605
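One possible way to set this up, again assuming scikit-learn: expand the features with `PolynomialFeatures` and fit an ordinary linear model on the expanded set. The data is synthetic and the variable names are illustrative. The sketch also records how many features each degree produces, which gives a sense of why higher degrees cost more.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic, nearly linear data with two input features (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(400, 2))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 1.0 + rng.normal(0, 0.1, size=400)

X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]

poly_scores = {}
poly_feature_counts = {}
for degree in (1, 2, 3):
    # Polynomial regression = polynomial feature expansion + linear regression.
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    model.fit(X_train, y_train)
    poly_scores[degree] = model.score(X_test, y_test)  # R^2 on held-out data
    poly_feature_counts[degree] = (
        model.named_steps["polynomialfeatures"].n_output_features_
    )

for degree in poly_scores:
    print(degree, poly_feature_counts[degree], poly_scores[degree])
```

With two inputs the expanded feature count grows from 3 to 6 to 10 as the degree goes from 1 to 3; with more input columns the growth is much steeper.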
Support vector regression (SVR) is the support vector machine algorithm adapted for regression tasks like this one.
Accuracy: 99.23% (0.9923266599052655)
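A hedged sketch of an SVR fit, assuming scikit-learn's `SVR` with its default RBF kernel. SVR is sensitive to the scale of both features and target, so this example standardizes both; whether the original notebook scaled its data is not stated here, so treat that part as an assumption. The data is synthetic.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): large-valued, nearly linear target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(400, 2))
y = 100.0 * X[:, 0] + 10.0 * X[:, 1] + rng.normal(0, 5.0, size=400)

X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]

# Standardize the features inside the pipeline and the target via
# TransformedTargetRegressor; SVR itself keeps its default hyperparameters.
svr = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(), SVR()),
    transformer=StandardScaler(),
)
svr.fit(X_train, y_train)

# Predictions are mapped back to the original target scale before scoring.
r2_svr = svr.score(X_test, y_test)
print(f"SVR R^2 on held-out data: {r2_svr:.4f}")
```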
Decision tree algorithm for regression tasks.
Accuracy: 99.99% (0.9999962088679886)
Random Forest algorithm for regression analysis.
Accuracy: 99.99% (0.9999981046318548)
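The two tree-based models can be compared side by side. This is a sketch under the assumption that scikit-learn's `DecisionTreeRegressor` and `RandomForestRegressor` were used with default hyperparameters; `random_state` is set only to make the example repeatable, and the data is synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption), not the ParisHousing.csv columns.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(600, 3))
y = 100.0 * X[:, 0] + 5.0 * X[:, 1] + rng.normal(0, 1.0, size=600)

X_train, y_train = X[:450], y[:450]
X_test, y_test = X[450:], y[450:]

models = {
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(random_state=0),
}

tree_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    tree_scores[name] = model.score(X_test, y_test)  # R^2 on held-out data

for name, score in tree_scores.items():
    print(f"{name}: {score:.6f}")
```

A random forest averages many decision trees trained on resampled data, which usually smooths out the step-like predictions of a single tree.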
Linear regression showed the best accuracy on this dataset. After the linear models (plain and polynomial), the Random Forest Regressor showed the highest accuracy among the remaining algorithms. RF is a powerful algorithm for both regression and classification problems, yet LR still outperformed it on this regression problem. Another observation is that as the degree of the polynomial increases, the accuracy decreases slightly and training becomes more and more costly, because the number of polynomial features grows quickly with the degree.
NB: None of these models are optimized; I used the preset default values. The data_handle.ipynb notebook handles data preprocessing, for which I used the well-known Pandas library. After preprocessing, I split the dataset into a training file and a testing file to make the job easier.
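The preprocessing and split described above might look roughly like the following. This is a sketch, not the contents of data_handle.ipynb: the column names, the 80/20 ratio, the cleaning steps, and the output file names train.csv and test.csv are all hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Tiny hypothetical frame standing in for the loaded dataset; in practice
# this would come from pd.read_csv on the downloaded file.
df = pd.DataFrame(
    {
        "squareMeters": [50, 60, 70, 80, 90, 100, 110, 120, 130, 140],
        "price": [5.0e5, 6.0e5, 7.0e5, 8.0e5, 9.0e5,
                  1.0e6, 1.1e6, 1.2e6, 1.3e6, 1.4e6],
    }
)

# Basic cleaning (assumed steps): drop missing rows and exact duplicates.
df = df.dropna().drop_duplicates()

# Random 80/20 split; the test set is every row not sampled for training.
train = df.sample(frac=0.8, random_state=0)
test = df.drop(train.index)

# Write the two parts to separate files (a temp directory is used here).
out_dir = Path(tempfile.mkdtemp())
train.to_csv(out_dir / "train.csv", index=False)
test.to_csv(out_dir / "test.csv", index=False)
```

Keeping the split in files means every model is trained and evaluated on exactly the same rows, which makes the reported scores comparable.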