Impact of Orthogonal Initialization in Deep Learning

Dynamical Isometry as a Consequence of Weight Orthogonality

Ester Hlav, 2019


How does orthogonal initialization of weight matrices improve the training of neural networks? And what happens if we further impose orthogonality during training? We study dynamical isometry and its positive impact on convergence during training.

What is Dynamical Isometry?

Dynamical isometry occurs when the singular values of the network's input-output Jacobian all concentrate around one. When the Jacobian J is well-conditioned in this sense, i.e. its singular values are (close to) one, J is a norm-preserving mapping: the spectral density of J is concentrated at one and dynamical isometry is reached. A network that achieves dynamical isometry keeps its gradients out of both the chaotic (exploding gradient) and the ordered (vanishing gradient) regime, which leads to faster and more stable convergence.
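
For intuition, here is a minimal sketch (not the repository's code; the width and depth are illustrative choices): for a deep *linear* network the input-output Jacobian is simply the product of the weight matrices, so orthogonal weights make every singular value exactly one, while scaled Gaussian weights let the spectrum spread out with depth.

```python
import numpy as np

rng = np.random.default_rng(0)
width, depth = 64, 20

def input_output_jacobian(init):
    # For a linear network y = W_L ... W_1 x, the Jacobian dy/dx is the product of the weights.
    J = np.eye(width)
    for _ in range(depth):
        W = rng.normal(size=(width, width)) / np.sqrt(width)
        if init == "orthogonal":
            W, _ = np.linalg.qr(W)   # orthogonalize the Gaussian draw
        J = W @ J
    return J

for init in ("orthogonal", "gaussian"):
    s = np.linalg.svd(input_output_jacobian(init), compute_uv=False)
    print(f"{init:>10}: min singular value = {s.min():.3f}, max = {s.max():.3f}")
```

With orthogonal weights every singular value is exactly 1 regardless of depth; with Gaussian weights the spectrum widens as depth grows, which is the linear-network picture behind exploding and vanishing gradients.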

Research Questions

Effect of Orthogonal Initialization on:
A) Vanishing and Exploding Gradients
B) Difference in Speed of Convergence between Deep and Shallow Neural Networks
C) Accuracy of Non-Linear Neural Networks with vs. without Dynamical Isometry

Effect of Orthogonal Regularization:
D) Can an excessive orthogonality constraint (i.e. hard regularization) hurt performance?
E) Do specific conditions (e.g. depth) make enforcing orthogonality more beneficial than others?


Empirical Results

The first part of the project studies the mathematical consequences of dynamical isometry; in the second part, we run experiments on recurrent neural networks (RNNs) and impose an orthogonal regularization constraint with gain during training. While orthogonal regularization gives inconclusive results on some datasets, on the Sequential MNIST (SeqMNIST) dataset a gain-adjusted regularizer outperforms a single soft regularizer.
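
As an illustration, one plausible form of such a gain-adjusted soft penalty (the exact loss used in the experiments may differ) pulls the Gram matrix of a weight toward g²·I and is added to the task loss with a small weight; the names `orthogonal_penalty`, `lam`, and `gain` below are illustrative, not taken from the repository.

```python
import torch

def orthogonal_penalty(weight: torch.Tensor, gain: float = 1.0) -> torch.Tensor:
    # Penalize deviation of W^T W from gain^2 * I (squared Frobenius norm).
    rows, cols = weight.shape
    W = weight if rows >= cols else weight.t()          # use the smaller Gram matrix
    gram = W.t() @ W
    target = (gain ** 2) * torch.eye(gram.shape[0], device=W.device, dtype=W.dtype)
    return ((gram - target) ** 2).sum()

# Usage inside a training step (lam and gain are hyperparameters):
# loss = task_loss + lam * orthogonal_penalty(rnn.weight_hh_l0, gain=1.0)
```

Setting the gain away from 1 lets the recurrent weights contract or expand slightly while remaining close to orthogonal, rather than forcing exact norm preservation.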

About

Mathematical consequences of orthogonal weight initialization and regularization in deep learning. Experiments with a gain-adjusted orthogonal regularizer on RNNs with the SeqMNIST dataset.
