SaddleFreeOptimizer

A second-order optimizer for TensorFlow that uses the Saddle-Free method of Dauphin et al. (2014) with some modifications.

Algorithm

The algorithm is described by Dauphin et al. (2014). This implementation follows the paper, with the following exceptions:

The order of operations in the Lanczos method follows that recommended by Paige (1972).
Three types of damping can be applied to the curvature matrix in the Krylov subspace; the type is selected in the optimizer's constructor.
Instead of applying multiple damping coefficients and finding the result with the lowest loss, this implementation uses a Marquardt-style heuristic to update the damping coefficient as per Martens (2010).
If you choose a Krylov dimension that is larger than the number of parameters in the model, the algorithm does not perform the Lanczos method; it essentially becomes a Levenberg-Marquardt method with multiple options for damping and a custom loss function. This is practical only for very small models, such as the XOR_Test example.
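
The Lanczos ordering and the damping heuristic mentioned above can be sketched in plain Python. This is a minimal illustration and not the code in SFOptimizer.py: the function names `lanczos` and `update_damping` are hypothetical, the curvature matrix is passed explicitly (a real optimizer would use Hessian-vector products instead), and the ordering shown (subtract the previous-vector term, then project onto the current vector) is one stable variant commonly associated with Paige's analysis, assumed here rather than confirmed from the source. The thresholds 1/4 and 3/4 and the factors 3/2 and 2/3 are those given by Martens (2010).

```python
import math


def matvec(A, v):
    """Dense matrix-vector product; stands in for a Hessian-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def lanczos(A, v0, k):
    """Run up to k Lanczos steps on a symmetric matrix A.

    Returns the Lanczos vectors plus the diagonal (alphas) and
    off-diagonal (betas) of the tridiagonal matrix T.
    No re-orthogonalization is performed.
    """
    norm0 = math.sqrt(dot(v0, v0))
    v = [x / norm0 for x in v0]
    v_prev = [0.0] * len(v0)
    beta_prev = 0.0
    basis, alphas, betas = [], [], []
    for j in range(k):
        basis.append(v)
        # Assumed ordering: remove the previous-vector component first ...
        u = [a - beta_prev * b for a, b in zip(matvec(A, v), v_prev)]
        # ... then project onto the current vector and remove that too.
        alpha = dot(v, u)
        w = [a - alpha * b for a, b in zip(u, v)]
        beta = math.sqrt(dot(w, w))
        alphas.append(alpha)
        if j == k - 1 or beta < 1e-12:
            break  # requested dimension reached, or invariant subspace found
        betas.append(beta)
        v_prev, v, beta_prev = v, [x / beta for x in w], beta
    return basis, alphas, betas


def update_damping(damping, actual_reduction, predicted_reduction):
    """Marquardt-style update in the form given by Martens (2010).

    rho compares the loss reduction actually obtained with the reduction
    predicted by the quadratic model (assumed nonzero here).
    """
    rho = actual_reduction / predicted_reduction
    if rho < 0.25:
        return damping * 1.5        # model was too optimistic: damp more
    if rho > 0.75:
        return damping * 2.0 / 3.0  # model was accurate: damp less
    return damping


if __name__ == "__main__":
    A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
    _, alphas, betas = lanczos(A, [1.0, 0.0, 0.0], 3)
    print("diagonal of T:", alphas)
    print("off-diagonal of T:", betas)
    print("new damping:", update_damping(1.0, 0.1, 1.0))
```

The saddle-free step itself (taking absolute values of the eigenvalues of T before inverting) is omitted to keep the sketch short.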

Files

SFOptimizer.py is the optimizer class.
mnist/dataset.py is a utility class from https://github.com/tensorflow/models.git used to obtain MNIST data.
XOR_Test.ipynb is a Jupyter notebook containing a simple network trained to learn the XOR function.
AE_Test.ipynb is a Jupyter notebook containing a deep autoencoder network trained with MNIST data.

Implementation Notes

The Lanczos iteration loop is unrolled into branches in the TensorFlow graph. This allows a full step to be taken in one TF operation. However, it means the graph can get large if you use a high Krylov dimension.
As in the original paper, no re-orthogonalization is used for the Lanczos vectors. This means that they will likely become linearly dependent if the Krylov dimension is high (perhaps above 100), so there is likely little benefit in using a Krylov dimension that large.
Tested with Python 3.6.7 and TensorFlow 1.12.0
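
The loss of orthogonality noted above can be observed with a small experiment. The following is a sketch under stated assumptions, not part of this repository: it uses a diagonal matrix as a stand-in for the curvature matrix, the helper names `lanczos_basis` and `orthogonality_error` are hypothetical, and the spectrum and Krylov dimensions in the usage section are arbitrary illustrative choices. It builds Lanczos vectors without re-orthogonalization and reports the largest inner product between distinct vectors, which would be zero in exact arithmetic.

```python
import math


def lanczos_basis(eigenvalues, k):
    """Lanczos vectors for A = diag(eigenvalues), no re-orthogonalization."""
    n = len(eigenvalues)
    v = [1.0 / math.sqrt(n)] * n  # normalized all-ones start vector
    v_prev = [0.0] * n
    beta_prev = 0.0
    basis = []
    for _ in range(k):
        basis.append(v)
        # A is diagonal, so the matrix-vector product is elementwise.
        u = [lam * x - beta_prev * p
             for lam, x, p in zip(eigenvalues, v, v_prev)]
        alpha = sum(a * b for a, b in zip(u, v))
        w = [a - alpha * b for a, b in zip(u, v)]
        beta = math.sqrt(sum(a * a for a in w))
        if beta == 0.0:
            break  # exact invariant subspace; no further vectors exist
        v_prev, v, beta_prev = v, [a / beta for a in w], beta
    return basis


def orthogonality_error(basis):
    """Largest |v_i . v_j| over distinct pairs (0 if perfectly orthogonal)."""
    worst = 0.0
    for i in range(len(basis)):
        for j in range(i):
            overlap = abs(sum(a * b for a, b in zip(basis[i], basis[j])))
            worst = max(worst, overlap)
    return worst


if __name__ == "__main__":
    # Hypothetical spectrum: a tight cluster plus a few large outliers.
    spectrum = [1.0 + 0.001 * i for i in range(195)]
    spectrum += [10.0, 100.0, 1e3, 1e4, 1e5]
    for k in (5, 20, 60):
        error = orthogonality_error(lanczos_basis(spectrum, k))
        print("Krylov dimension", k, "orthogonality error", error)
```

How quickly the error grows depends on the spectrum and the start vector, so the numbers printed here say nothing definite about a particular model.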
