Ensemble Kalman Filter optimizing Deep Neural Networks: An alternative approach to non-performing Gradient Descent
This file contains a short description of how to run the accompanying code. Supplied are five Python files, a README, and a configuration file:
- `enkf_pytorch_conv-run.py`: The main file to run the code (see Enkf-optimization).
- `enkf_pytorch.py`: The EnKF optimizer.
- `conv_net.py`: The convolutional network; it is also runnable on its own (see gd-optimization).
- `config.json`: The parameter configuration file.
- `README.md`: This file, a description of how to run the code.
- `plot_accuracy.py`: Plots the test errors (see plotting).
- `plot_grads_acts.py`: Plots the gradient and activation function values.
Note: This code example focuses on the MNIST dataset run.
Code:
- numpy>=1.16.0
- pytorch>=1.2.0
For plotting:
- matplotlib>=3.0.0
- seaborn>=0.9.0
See also requirements.txt
The config file `config.json` sets the parameters of the method:
- `root`: The directory where PyTorch's dataloader will download and save the MNIST dataset. Default is `.`, the current folder.
- `n_ensembles`: Number of ensemble members. Default is `5000`.
- `gamma`: A small scalar that scales the identity matrix. Default is `0.01`.
- `sigma`: Standard deviation (sigma) of the Gaussian initialization. Default is `0.1`.
- `batch_size`: The mini-batch size. Default is `64`.
- `seed`: Sets the seed for PyTorch's and NumPy's random functions. Default is `0`.
- `checkpoints`: Number of iterations until the first result is stored, counted from the first iteration. Default is `500`.
- `repetitions`: Number of times each mini-batch is presented to the network in the training phase. Default is `8`.
- `epochs`: Number of epochs; only used by the convolutional network.
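To make the roles of `n_ensembles`, `gamma`, and `sigma` more concrete, here is a minimal sketch of a generic ensemble Kalman update on a toy linear problem. It follows the textbook form of the update (cross-covariance times the inverse of the regularized output covariance); whether `enkf_pytorch.py` uses exactly this form is an assumption. The function `enkf_update`, the linear model `design`, and the reduced ensemble size are hypothetical and chosen only for illustration.

```python
import numpy as np


def enkf_update(ensemble, outputs, target, gamma):
    """One generic ensemble Kalman update step (illustrative sketch).

    ensemble: (J, d) array, one parameter vector per ensemble member
    outputs:  (J, k) array, model output of each member
    target:   (k,) array, the observed target
    gamma:    scalar; gamma * I is added to the output covariance
    """
    n_members = ensemble.shape[0]
    # Deviations of parameters and outputs from their ensemble means
    param_dev = ensemble - ensemble.mean(axis=0)
    out_dev = outputs - outputs.mean(axis=0)
    # Empirical cross-covariance (d, k) and output covariance (k, k)
    cov_uw = param_dev.T @ out_dev / n_members
    cov_ww = out_dev.T @ out_dev / n_members
    # gamma scales the identity matrix and keeps the linear system well conditioned
    regularized = cov_ww + gamma * np.eye(cov_ww.shape[0])
    residual = target - outputs  # (J, k), target is broadcast over members
    # Solve the linear system instead of forming an explicit inverse
    correction = np.linalg.solve(regularized, residual.T)  # (k, J)
    return ensemble + (cov_uw @ correction).T


rng = np.random.default_rng(0)  # seed 0, matching the documented default
sigma, gamma = 0.1, 0.01        # documented defaults
n_ensembles = 50                # far smaller than the default 5000, toy problem only

# Hypothetical linear model standing in for the network: output = design @ params
design = rng.normal(size=(5, 3))
true_params = np.array([1.0, -2.0, 0.5])
target = design @ true_params

# Gaussian initialization of every ensemble member with standard deviation sigma
ensemble = rng.normal(0.0, sigma, size=(n_ensembles, 3))
initial_error = np.linalg.norm(ensemble.mean(axis=0) - true_params)

for _ in range(20):
    outputs = ensemble @ design.T  # forward pass for every member
    ensemble = enkf_update(ensemble, outputs, target, gamma)

final_error = np.linalg.norm(ensemble.mean(axis=0) - true_params)
print(f"error of ensemble mean: {initial_error:.3f} -> {final_error:.3f}")
```

No gradients are computed anywhere in this sketch; the update relies only on forward passes and ensemble statistics.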
All Python files and the configuration file should be in the same directory. `enkf_pytorch_conv-run.py` is the main Python file and can be executed from the terminal with `python enkf_pytorch_conv-run.py`. It reads the parameters from the JSON configuration file `config.json`. PyTorch's dataloader will download the MNIST files into the folder where the scripts are located unless specified otherwise (see Configuration file).
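As an illustration of reading the configuration, the following sketch loads a JSON file and falls back to the defaults documented in this README. It assumes a flat JSON object with one key per parameter and the key spelling `repetitions`; the actual layout of `config.json` and the helper `load_config` are assumptions, not taken from the accompanying code.

```python
import json
import tempfile
from pathlib import Path

# Defaults as documented in this README; `epochs` has no documented default.
DOCUMENTED_DEFAULTS = {
    "root": ".",
    "n_ensembles": 5000,
    "gamma": 0.01,
    "sigma": 0.1,
    "batch_size": 64,
    "seed": 0,
    "checkpoints": 500,
    "repetitions": 8,  # key spelling is an assumption
}


def load_config(path):
    """Read a JSON config and fill in documented defaults for missing keys."""
    with open(path, encoding="utf-8") as handle:
        user_values = json.load(handle)
    config = {**DOCUMENTED_DEFAULTS, **user_values}
    # Basic sanity checks before starting an expensive run
    if config["n_ensembles"] < 2:
        raise ValueError("an ensemble needs at least two members")
    if config["gamma"] <= 0 or config["sigma"] <= 0:
        raise ValueError("gamma and sigma must be positive")
    return config


# Usage with a throwaway file, so the example does not touch the real config.json
with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = Path(tmp_dir) / "config.json"
    example_path.write_text(json.dumps({"n_ensembles": 100, "batch_size": 32}))
    config = load_config(example_path)

print(config["n_ensembles"], config["batch_size"], config["gamma"])
```

Keys present in the file override the defaults; missing keys keep the documented values.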
`python conv_net.py` runs the convolutional network with gradient descent optimization and tests different parameter settings, such as different standard deviations for the initialization (`stds`) and two gradient descent optimizers (`sgd` and `adam`). Test results are also stored in the current folder.
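A sweep over initialization standard deviations and optimizers could be organized as a simple grid, sketched below. The values in `stds` are hypothetical, since the settings actually tested by `conv_net.py` are not listed here, and `build_runs` is an illustrative helper rather than a function from the code.

```python
from itertools import product

# Hypothetical values; the lists actually tested by conv_net.py are not given here.
stds = [0.1, 1.0, 3.0]
optimizers = ["sgd", "adam"]


def build_runs(stds, optimizers):
    """Return one settings dict per (standard deviation, optimizer) pair."""
    return [{"std": std, "optimizer": name} for std, name in product(stds, optimizers)]


runs = build_runs(stds, optimizers)
for run in runs:
    # A real run would initialize the weights with this standard deviation,
    # train with the named optimizer, and store the test results.
    print(f"std={run['std']}, optimizer={run['optimizer']}")
```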
Two plotting scripts with corresponding data are supplied:
- `plot_accuracy.py`: Plots the test errors, i.e., Figures 1, 5, 8, and 10.
  - Figure 1 needs the files `SGD_test_accuracy_ep*.pt` and `acc_losses_ep*.pt` to be in the folder `test_losses`.
  - Figure 5 needs the file `test_acc.pt`.
  - Figure 8a needs the files `acc_loss.pt`, `more_ensembles_acc_loss.pt`, and `less_ensembles_acc_loss.pt`.
  - Figure 8b needs the files `acc_loss.pt`, `relu_acc_loss.pt`, and `tanh_acc_loss.pt`.
  - Figure 10 needs the file ``
- `plot_grads_acts.py`: Plots the gradient and activation function values, i.e., Figures 3, 4, 6, and 7.
  - Figure 3 needs the file `act_func.npy`.
  - Figure 4 needs the file `gradients.npy`.
  - Figure 6 needs the files `gradients_ep*.npy`.
  - Figure 7 needs the files `act_func_ep*.npy`.
  - Figure 9 needs the files `dyn_change.pt` and `acc_loss.pt`.
All data files can be downloaded here. The contents of the folder can be merged into the code folder.
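Before running the plotting scripts, it can help to check that the data files are in place. The sketch below encodes the file names listed above and reports which ones are missing. The helper `missing_files` is hypothetical, the folder name `test_losses` for Figure 1 is an assumption based on the list above, and Figure 10 is left out because its file name is not given.

```python
import tempfile
from glob import glob
from pathlib import Path

# File patterns per figure, copied from the lists above.
# Assumption: the Figure 1 files live in a subfolder named test_losses.
REQUIRED_FILES = {
    "Figure 1": ["test_losses/SGD_test_accuracy_ep*.pt", "test_losses/acc_losses_ep*.pt"],
    "Figure 3": ["act_func.npy"],
    "Figure 4": ["gradients.npy"],
    "Figure 5": ["test_acc.pt"],
    "Figure 6": ["gradients_ep*.npy"],
    "Figure 7": ["act_func_ep*.npy"],
    "Figure 8a": ["acc_loss.pt", "more_ensembles_acc_loss.pt", "less_ensembles_acc_loss.pt"],
    "Figure 8b": ["acc_loss.pt", "relu_acc_loss.pt", "tanh_acc_loss.pt"],
    "Figure 9": ["dyn_change.pt", "acc_loss.pt"],
}


def missing_files(base_dir="."):
    """Map each figure to the required patterns that match no file in base_dir."""
    missing = {}
    for figure, patterns in REQUIRED_FILES.items():
        absent = [p for p in patterns if not glob(str(Path(base_dir) / p))]
        if absent:
            missing[figure] = absent
    return missing


# Demonstration in a temporary folder containing only two of the data files
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("test_acc.pt", "gradients_ep1.npy"):
        (Path(tmp_dir) / name).touch()
    result = missing_files(tmp_dir)

for figure, patterns in result.items():
    print(f"{figure}: missing {', '.join(patterns)}")
```

Calling `missing_files()` without arguments checks the current folder, which matches the setup where the downloaded data is merged into the code folder.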