use line function to test linear regression #15

Open
crazylk opened this issue Jan 10, 2019 · 2 comments

@crazylk

crazylk commented Jan 10, 2019

I used y = 4 + 0.5*x to create test data, but the result is not right.
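
For reference, here is a minimal sketch of what the expected result would look like. It assumes noise-free data sampled from this line and uses NumPy's `np.polyfit` as an independent least-squares baseline; it is not part of the homemade library and only shows which slope and intercept a correct fit should recover.

```python
import numpy as np

# Noise-free samples of the line from this issue: y = 4 + 0.5 * x.
x = np.arange(100, dtype=float)
y = 4 + 0.5 * x

# np.polyfit returns coefficients from the highest degree to the lowest,
# so a degree-1 fit yields (slope, intercept).
slope, intercept = np.polyfit(x, y, deg=1)

print('slope: {:.4f}, intercept: {:.4f}'.format(slope, intercept))
```

A fit that behaves correctly should report a slope close to 0.5 and an intercept close to 4.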

crazylk changed the title from "use line ruction to test linear regression" to "use line function to test linear regression" on Jan 10, 2019

@trekhleb
Owner

@crazylk could you please share the link to your Jupyter notebook where you're playing with linear regression? I just need to see what training parameters you're using.

@crazylk
Author

crazylk commented Jan 11, 2019

Sorry, I use Java and JavaScript. I don't know Python well and don't know how to use Jupyter, so here is a simple Python file for you. It may contain some mistakes. Thank you!

import numpy as np
import random
import matplotlib.pyplot as plt
import pandas as pd
from homemade.linear_regression import LinearRegression

def create_data(in_count):
    x_train_list = []
    y_train_list = []
    x_test_list = []
    y_test_list = []
    for i in range(0, in_count):
        x = 0.0 + i
        y = 4 + 0.5 * i
        if random.randint(1, 101) > 5:
            x_train_list.append(x)
            y_train_list.append(y)
        else:
            x_test_list.append(x)
            y_test_list.append(y)

    x_train_list = np.array(x_train_list).reshape(len(x_train_list), 1)
    y_train_list = np.array(y_train_list).reshape(len(y_train_list), 1)
    x_test_list = np.array(x_test_list).reshape(len(x_test_list), 1)
    y_test_list = np.array(y_test_list).reshape(len(y_test_list), 1)
    return x_train_list, y_train_list, x_test_list, y_test_list

(x_train, y_train, x_test, y_test) = create_data(100)

plt.scatter(x_train, y_train, label='Training Dataset')
plt.scatter(x_test, y_test, label='Test Dataset')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Line function:y = 4 + 0.5*x')
plt.legend()
plt.show()

# Set up linear regression parameters.

num_iterations = 50000 # Number of gradient descent iterations.
regularization_param = 0 # Helps to fight model overfitting.
learning_rate = 0.01 # The size of the gradient descent step.
polynomial_degree = 0 # The degree of additional polynomial features.
sinusoid_degree = 0 # The degree of sinusoid parameter multipliers of additional features.

# Init linear regression instance.

linear_regression = LinearRegression(x_train, y_train, polynomial_degree, sinusoid_degree)

# Train linear regression.

(theta, cost_history) = linear_regression.train(
    learning_rate,
    regularization_param,
    num_iterations
)

# Print training results.

print('Initial cost: {:.2f}'.format(cost_history[0]))
print('Optimized cost: {:.2f}'.format(cost_history[-1]))

# Print model parameters.

theta_table = pd.DataFrame({'Model Parameters': theta.flatten()})
# In a plain Python file (outside Jupyter) the table must be printed explicitly.
print(theta_table.head())

# Get model predictions for the training set.

predictions_num = 1300
x_predictions = np.linspace(x_train.min(), x_train.max(), predictions_num).reshape(predictions_num, 1)
y_predictions = linear_regression.predict(x_predictions)

# Plot training data with predictions.

plt.scatter(x_train, y_train, label='Training Dataset')
plt.scatter(x_test, y_test, label='Test Dataset')
plt.plot(x_predictions, y_predictions, 'r', label='Prediction')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Line function: y = 4 + 0.5*x')
plt.legend()
plt.show()

train_cost = linear_regression.get_cost(x_train, y_train, regularization_param)
test_cost = linear_regression.get_cost(x_test, y_test, regularization_param)

print('Train cost: {:.2f}'.format(train_cost))
print('Test cost: {:.2f}'.format(test_cost))

test_predictions = linear_regression.predict(x_test)

test_predictions_table = pd.DataFrame({
    'x': x_test.flatten(),
    'Test Value': y_test.flatten(),
    'Predicted Value': test_predictions.flatten(),
    'Prediction Diff': (y_test - test_predictions).flatten()
})

# In a plain Python file (outside Jupyter) the table must be printed explicitly.
print(test_predictions_table.head(10))
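
One thing that may be worth checking when the printed parameters do not look like 4 and 0.5 is whether the features are scaled before training. The sketch below is plain NumPy and does not use the homemade library; whether that library standardizes its inputs is an assumption, since its internals are not shown in this thread. The helper `gradient_descent` and the iteration count are hypothetical and exist only for this illustration. It shows that gradient descent on standardized x can fit the line exactly while reporting parameters on a different scale, and how to map them back.

```python
import numpy as np


def gradient_descent(features, labels, learning_rate, num_iterations):
    """Plain batch gradient descent for y = theta0 + theta1 * feature."""
    num_examples = features.shape[0]
    # Prepend a column of ones so that theta[0] acts as the intercept.
    design = np.hstack((np.ones((num_examples, 1)), features))
    theta = np.zeros((2, 1))
    for _ in range(num_iterations):
        errors = design @ theta - labels
        theta = theta - (learning_rate / num_examples) * (design.T @ errors)
    return theta


# Same line as in the script: y = 4 + 0.5 * x for x = 0..99.
x = np.arange(100, dtype=float).reshape(-1, 1)
y = 4 + 0.5 * x

# Assumption for this sketch: x is standardized (zero mean, unit variance)
# before training.
x_mean, x_std = x.mean(), x.std()
theta_scaled = gradient_descent((x - x_mean) / x_std, y, 0.01, 5000)

# Map the learned parameters back to the original x scale.
slope = theta_scaled[1, 0] / x_std
intercept = theta_scaled[0, 0] - slope * x_mean

print('theta on standardized x:', theta_scaled.flatten())
print('slope: {:.4f}, intercept: {:.4f}'.format(slope, intercept))
```

Under this assumption the raw parameters differ from (4, 0.5) even though the predictions match the line, so comparing predictions or the cost is a more reliable check than comparing theta directly. Without scaling, a step size of 0.01 would be too large for this plain update rule on x values up to 99, and the iterates would diverge.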
