This repository has been archived by the owner on Feb 28, 2018. It is now read-only.

Sigmoid derivative in nn.js #22

Open
funderburkjim opened this issue Jan 7, 2018 · 3 comments

Comments

@funderburkjim

As I read it, your formula for the derivative of the sigmoid function is wrong in nn.js.
You have

NeuralNetwork.dSigmoid = function(x) {
  return x * (1 - x);
}

but it should be

NeuralNetwork.dSigmoid = function(x) {
  // sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)), so apply sigmoid first
  var y = NeuralNetwork.sigmoid(x);
  return y * (1 - y);
}
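A quick way to see the difference: the formula x * (1 - x) is only correct when x is already the sigmoid output, not the pre-activation input. A minimal standalone check (with a local sigmoid definition rather than the one in nn.js), comparing the corrected dSigmoid against a finite-difference estimate of the derivative:

// Standalone sketch: compare the corrected dSigmoid against a
// numerical (finite-difference) derivative of sigmoid.
var sigmoid = function(x) {
  return 1 / (1 + Math.exp(-x));
};

var dSigmoid = function(x) {
  var y = sigmoid(x);
  return y * (1 - y);
};

var x = 0.7;
var h = 1e-6;
var numerical = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h);

console.log(dSigmoid(x));   // ~0.2217, matches the numerical estimate
console.log(numerical);     // ~0.2217
console.log(x * (1 - x));   // 0.21, old formula applied to the raw input -- does not match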

Reference

The style and substance of your 'Coding Train' material is very enjoyable. 👍 Thank you.

@funderburkjim
Author

In order for the train method to work properly with the sigmoid activation function and the corrected dSigmoid, two lines in the train method need to be changed.

old: 
var gradient_output = Matrix.map(outputs, this.derivative);
var gradient_hidden = Matrix.map(hidden_outputs, this.derivative);

new:
var gradient_output = Matrix.map(output_inputs, this.derivative);
var gradient_hidden = Matrix.map(hidden_inputs, this.derivative);

This is tricky. Your original code gave the correct answer with a sigmoid activation function, but would, I think, fail with the tanh activation function; this is because the relation between the sigmoid function and its derivative is different from that between the tanh function and its derivative.
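For what it's worth, here is a sketch of how a tanh pair might look (the names NeuralNetwork.tanh and NeuralNetwork.dTanh are hypothetical, not taken from nn.js). With the corrected train code, both derivatives take the pre-activation input, which is what makes the two activations interchangeable:

// Hypothetical tanh pair, written the same way as the corrected dSigmoid:
// both take the pre-activation input x, not the activation output.
NeuralNetwork.tanh = function(x) {
  return Math.tanh(x);
}

NeuralNetwork.dTanh = function(x) {
  var y = Math.tanh(x);
  return 1 - y * y;   // tanh'(x) = 1 - tanh(x)^2
}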

@funderburkjim
Author

Some other small suggestions:

  1. To test out the above, I used the 'copy' method of NeuralNetwork. In the course of this, I noticed an error
    at line 59: this.lr = this.lr, which should be this.lr = nn.lr.

  2. Your naming of variables follows the book's quite closely, except at one point in train.
    You use the variable names output_inputs and outputs where the book uses
    final_inputs and final_outputs. It would make it slightly easier to compare your
    code to the book if you used the book's variable names here as well.

  3. At lines 77,78 of nn.js, you use the randomize method from the Matrix object in matrix.js.

    this.wih.randomize();
    this.who.randomize();

Matrix.randomize() uses a function from p5.js:

this.matrix[i][j] = randomGaussian();

To follow the book more closely, I replaced this (in nn.js) with

      this.wih.nn_randomize_uniform(-0.5,0.5);
      this.who.nn_randomize_uniform(-0.5,0.5);

and added the method nn_randomize_uniform in matrix.js:

Matrix.prototype.nn_randomize_uniform = function(min, max) {
  // Written for nn.js  (ejf)
  // This follows the book.
  // Note: Alternatively, p.133 'Optional: More Sophisticated Weights'
  // Fill every entry with a uniform random value in [min, max)
  for (var i = 0; i < this.rows; i++) {
    for (var j = 0; j < this.cols; j++) {
      this.matrix[i][j] = (Math.random() * (max - min)) + min;
    }
  }
}

This has the virtue of following the book's default method, and also removing the dependence of nn.js
on p5.js.
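If you wanted to keep Gaussian initialization but still drop the p5.js dependency, a sketch along the same lines might look like this (the method name nn_randomize_gaussian is hypothetical, and a plain Box-Muller transform stands in for p5's randomGaussian()):

Matrix.prototype.nn_randomize_gaussian = function() {
  // Box-Muller transform: turn two uniform samples into one
  // standard-normal sample, so p5's randomGaussian() is not needed.
  for (var i = 0; i < this.rows; i++) {
    for (var j = 0; j < this.cols; j++) {
      var u1 = 1 - Math.random();  // in (0, 1], avoids log(0)
      var u2 = Math.random();
      this.matrix[i][j] = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
    }
  }
}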

Hope you don't mind these minor nitpicks!

@shiffman
Owner

This is wonderful, thank you so much for this detailed set of comments! I'm in the process of creating the video tutorials that correspond to this code so I'll work on adding these in as I go!
