Multivariate-Polynomial-Regression

Multivariate Polynomial Regression using gradient descent.

In this assignment, polynomial regression models of degrees 1 through 6 have been developed for the 3D Road Network (North Jutland, Denmark) Data Set using the gradient descent method. R squared, RMSE and squared error values have been calculated and compared for each model to find the models that best fit the data, as well as those that overfit it. L1 and L2 regularisation have been implemented to explore the effect of regularisation on testing loss and overfitting.

Dataset

Number of Instances: 43487
Number of Attributes: 4

Attributes:

OSM_ID: OpenStreetMap ID for each road segment or edge in the graph.
LONGITUDE: (Google format) longitude
LATITUDE: (Google format) latitude
ALTITUDE: Height in meters.

The first attribute (OSM_ID) has been dropped. LONGITUDE and LATITUDE values have been used to predict the target variable, ALTITUDE.
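As a minimal sketch of this step, the snippet below separates the features from the target. It assumes the data file is comma-separated with no header row and with columns in the order listed above. The three sample rows are made up for illustration and are not taken from the dataset.

```python
import io

import numpy as np

# Three made-up rows in the assumed column order:
# OSM_ID, LONGITUDE, LATITUDE, ALTITUDE
sample = io.StringIO(
    "1001,9.35,56.74,17.05\n"
    "1001,9.35,56.74,17.61\n"
    "1002,9.36,56.75,18.08\n"
)

data = np.loadtxt(sample, delimiter=",")

X = data[:, 1:3]  # keep LONGITUDE and LATITUDE, drop OSM_ID (column 0)
y = data[:, 3]    # ALTITUDE is the target variable
```

For the real file, the StringIO object would be replaced by the path to 3D_spatial_network.txt, provided the file matches the assumed layout.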

Structure

The code is divided into two files, generate_polynomials.py and polynomial_regression.py.

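generate_polynomials.py calculates the polynomial terms for each degree. For instance, with the two input features written as $x_1$ and $x_2$, the degree 2 model is of the form:

$y = w_0 + w_1x_1 + w_2x_2 + w_3x_1^2 + w_4x_1x_2 + w_5x_2^2$

For this model, generate_polynomials.py calculates the terms $x_1$, $x_2$, $x_1^2$, $x_1x_2$ and $x_2^2$.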
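One way such terms could be generated is sketched below. The function name polynomial_terms and the column ordering are assumptions made for illustration. They are not taken from generate_polynomials.py.

```python
from itertools import combinations_with_replacement

import numpy as np


def polynomial_terms(X, degree):
    """Return a matrix with a bias column followed by every monomial
    of the input features up to the given total degree.

    Hypothetical helper; the ordering of columns is an assumption.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    columns = [np.ones(n_samples)]  # bias term, multiplies w0
    for d in range(1, degree + 1):
        # Each index tuple picks the features multiplied together,
        # e.g. (0, 1) -> x1 * x2 and (0, 0) -> x1 ** 2.
        for idx in combinations_with_replacement(range(n_features), d):
            columns.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(columns)


# One sample with x1 = 2 and x2 = 3, expanded to degree 2.
terms = polynomial_terms([[2.0, 3.0]], degree=2)
# Columns: 1, x1, x2, x1^2, x1*x2, x2^2
```

With two input features, this construction yields 6 columns for degree 2 and 28 columns for degree 6.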

polynomial_regression.py implements gradient descent for the six models, minimising the loss function:

$E = \frac{1}{2N}\sum_{i=1}^{N} \left((w_0 + w_1x_1^{(i)} + w_2x_2^{(i)} + \dots) - y^{(i)}\right)^2$
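In matrix form, with X holding one row of polynomial terms per sample (including a leading column of ones for $w_0$), this loss and its gradient with respect to the weights can be sketched as follows. The function names are illustrative and are not taken from polynomial_regression.py.

```python
import numpy as np


def loss(X, y, w):
    """Half mean squared error: E = (1 / (2N)) * sum((X @ w - y) ** 2)."""
    residual = X @ w - y
    return float(residual @ residual) / (2 * len(y))


def gradient(X, y, w):
    """Gradient of E with respect to w: (1 / N) * X^T (X @ w - y)."""
    return X.T @ (X @ w - y) / len(y)


# Tiny worked example: two samples, a bias column and one feature.
X = np.array([[1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 2.0])
w = np.zeros(2)

example_loss = loss(X, y, w)      # residuals are -1 and -2, so E = 5 / 4
example_grad = gradient(X, y, w)  # X^T [-1, -2] / 2
```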

Gradient Descent

For each model, the training error was plotted at each iteration; the error drops steadily as training proceeds. The following figure shows the plot of training error for the degree 3 model.
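A minimal sketch of a batch gradient descent loop that records the training error at every iteration is shown below. The learning rate, iteration count and toy data are assumptions chosen for illustration, not the values used in this assignment.

```python
import numpy as np


def gradient_descent(X, y, learning_rate=0.1, n_iterations=2000):
    """Minimise E = (1 / (2N)) * sum((X @ w - y) ** 2) by batch gradient
    descent and return the weights with the per-iteration training error.

    The default hyperparameters are illustrative assumptions.
    """
    n_samples = len(y)
    w = np.zeros(X.shape[1])
    history = []
    for _ in range(n_iterations):
        residual = X @ w - y
        history.append(float(residual @ residual) / (2 * n_samples))
        w -= learning_rate * (X.T @ residual) / n_samples
    return w, history


# Toy data generated from y = 1 + 2x, with a bias column in X.
x = np.linspace(-1.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

w, history = gradient_descent(X, y)
```

Plotting history against the iteration index gives a training error curve of the kind described above. On this toy problem the recovered weights approach 1 and 2.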

R Squared, RMSE and Squared-error

R Squared and RMSE were computed for each model.

The results show that, up to degree 3, the testing error drops as the degree increases, but increasing the degree beyond 3 results in an increase in error. This suggests that the degree 3 model fits the data best, whereas the models of degree 4, 5 and 6 overfit the data. The average absolute value of the weights also grows with the degree, which suggests that the weights are assuming arbitrarily large values to fit the training data.
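The metrics compared in this section can be sketched as follows. Squared error is taken here to be the sum of squared residuals, which is an assumption; the repository may scale it differently. The function names are illustrative.

```python
import numpy as np


def squared_error(y_true, y_pred):
    """Sum of squared residuals (assumed definition of squared error)."""
    residual = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(residual @ residual)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(squared_error(y_true, y_pred) / len(y_true)))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    total = float(np.sum((y_true - y_true.mean()) ** 2))
    return 1.0 - squared_error(y_true, y_pred) / total


# Small worked example: only the last prediction is off, by 1.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 4.0]

example_se = squared_error(y_true, y_pred)
example_rmse = rmse(y_true, y_pred)
example_r2 = r_squared(y_true, y_pred)
```

Evaluating these functions on held-out test data for each degree gives the kind of comparison discussed above.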

Regularisation

To address the problem of overfitting, L1 and L2 regularisation have been implemented for the degree 6 model. The following figures show the effect of regularisation on testing error.

Regularisation results in a sharp decrease in testing error. In fact, the loss for the degree 6 polynomial model with regularisation is comparable with the loss for the degree 1, 2, 3 and 4 polynomial models without regularisation.

The average absolute weight decreases sharply for the models with regularisation. Once regularised, the weights no longer assume the large values that cause the model to oscillate wildly and overfit the data.
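The effect on the weights can be illustrated with a small sketch. It assumes the penalised losses $E + \frac{\lambda}{2}\sum_j w_j^2$ for L2 and $E + \lambda\sum_j |w_j|$ for L1, with the bias weight $w_0$ left unpenalised. The penalty scaling, the values of $\lambda$ and the toy data are all assumptions, not the settings used in this assignment.

```python
import numpy as np


def fit(X, y, penalty=None, lam=0.0, learning_rate=0.1, n_iterations=2000):
    """Gradient descent on the half mean squared error, optionally with
    an L1 or L2 penalty on every weight except the bias (assumed form)."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iterations):
        grad = X.T @ (X @ w - y) / len(y)
        reg = np.zeros_like(w)
        if penalty == "l2":
            reg[1:] = lam * w[1:]           # gradient of (lam / 2) * w^2
        elif penalty == "l1":
            reg[1:] = lam * np.sign(w[1:])  # subgradient of lam * |w|
        w -= learning_rate * (grad + reg)
    return w


def average_absolute_weight(w):
    return float(np.mean(np.abs(w)))


# Toy data generated from y = 1 + 2x, with a bias column in X.
x = np.linspace(-1.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

w_plain = fit(X, y)
w_l2 = fit(X, y, penalty="l2", lam=1.0)
w_l1 = fit(X, y, penalty="l1", lam=0.1)
```

On this toy problem both penalties shrink the slope weight, so the average absolute weight is lower than for the unregularised fit.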

Instructions for executing:

Run python polynomial_regression.py to build models for degrees 1 through 6 and generate comparative graphs for R Squared, RMSE and Squared Error, using gradient descent with and without regularisation.


prathmachowksey/Multivariate-Polynomial-Regression
